Searching for UB on Windows

I really need some suggestions for a problem that is giving me a lot of trouble. I should say up front that I generally work on Linux, so there may be an easy and straightforward solution that I am missing. In that case, sorry for my ignorance :smile:

The program

I want a tool for Windows that runs another program and closes it after a certain amount of idle time.

To do so, I spawn a process using tokio_process, inject a DLL (see below) that captures mouse and keyboard events, and use a custom Future implementation that handles three simple situations:

  1. When input events are captured
  2. When the timeout is reached
  3. When the program is closed
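
The three situations above can be sketched without any async machinery. This is only a minimal illustration, not my actual code: it swaps the custom Future for a blocking `std::sync::mpsc` channel with `recv_timeout`, and the names `Event`, `Outcome` and `watch` are made up for the example.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Messages the watcher can receive (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    /// The injected DLL reported mouse or keyboard activity.
    Input,
    /// The child process exited on its own.
    Exited,
}

/// Why the watcher stopped.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    ChildExited,
    IdleTimeout,
}

/// Blocks until the child exits or no event arrives for `idle`.
pub fn watch(events: &Receiver<Event>, idle: Duration) -> Outcome {
    loop {
        match events.recv_timeout(idle) {
            // Situation 1: input captured, so wait again with a fresh timeout.
            Ok(Event::Input) => continue,
            // Situation 3: the program was closed.
            Ok(Event::Exited) => return Outcome::ChildExited,
            // Situation 2: nothing arrived within `idle`.
            Err(RecvTimeoutError::Timeout) => return Outcome::IdleTimeout,
            // Every sender is gone; treated here as the child having exited.
            Err(RecvTimeoutError::Disconnected) => return Outcome::ChildExited,
        }
    }
}
```

Each `Input` restarts the wait, so the timeout only fires after a full `idle` period with no events.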

I found that the critical situation is related to the second point, and I will talk about this later. Let’s go on.

The DLL

Here comes the magic. Every time DllMain is called with a process or thread ATTACH, I want to set a Windows hook for WH_MOUSE and WH_KEYBOARD on the current thread, and I want to unhook when a DETACH happens.

One of the first things I discovered is that I could not use thread-local storage: two calls to my DllMain appear to belong to different threads even when GetCurrentThreadId returns the same id, and the TLS data I had previously set is magically empty. OK, who cares, let’s use lazy_static with a RwLock<HashMap<>> to store the data, keyed by the GetCurrentThreadId output. (I am sharing this because it might be useful to someone else.)

One more detail: when I get a process ATTACH followed by a thread ATTACH with the same thread id, I must not install a second hook. But then what about detaching? I solved this with a counting Hook struct which, for the current thread id, installs the hook only the first time. When its unhook fn is called, it really unhooks only once its counter reaches zero. Nothing special, nothing new.
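
To make the counting idea concrete, here is a rough sketch of the bookkeeping alone. It is not my real Hook struct: `HookCounts`, `attach` and `detach` are names invented for the example, the thread id is a plain `u32` standing in for the GetCurrentThreadId output, and the actual hook and unhook calls are left to the caller.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Attach counts keyed by thread id (hypothetical helper type).
#[derive(Default)]
pub struct HookCounts {
    counts: RwLock<HashMap<u32, usize>>,
}

impl HookCounts {
    /// Records an attach. Returns true only for the first attach of
    /// `thread_id`, which is when the caller should really install the hook.
    pub fn attach(&self, thread_id: u32) -> bool {
        let mut counts = self.counts.write().unwrap();
        let n = counts.entry(thread_id).or_insert(0);
        *n += 1;
        *n == 1
    }

    /// Records a detach. Returns true only when the counter reaches zero,
    /// which is when the caller should really unhook.
    pub fn detach(&self, thread_id: u32) -> bool {
        let mut counts = self.counts.write().unwrap();
        let remaining = match counts.get_mut(&thread_id) {
            // Stored counts are always >= 1, so this cannot underflow.
            Some(n) => {
                *n -= 1;
                *n
            }
            // Detach without a matching attach: nothing to unhook.
            None => return false,
        };
        if remaining == 0 {
            counts.remove(&thread_id);
            true
        } else {
            false
        }
    }
}
```

With this, a process ATTACH followed by a thread ATTACH on the same id yields `true` then `false` from `attach`, and the matching detaches yield `false` then `true`.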

I decided to share event data with the main process through UDP (I know, the overhead is huge, but for now I don’t care). So, what happens is: DllMain is called with the thread/process attach flag -> the hook is set -> (asynchronously) the hook function is called -> a message is sent through UDP. Simple enough.
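
The UDP leg could look roughly like this using only `std::net::UdpSocket`. The wire format (one tag byte plus the thread id as four little-endian bytes) and the names `send_event`, `recv_event`, `EVENT_MOUSE` and `EVENT_KEYBOARD` are assumptions made for the sketch, not what my DLL really sends.

```rust
use std::io;
use std::net::{SocketAddr, UdpSocket};

/// One-byte event tags (hypothetical wire format).
pub const EVENT_MOUSE: u8 = 1;
pub const EVENT_KEYBOARD: u8 = 2;

/// Sends one event datagram: [tag, thread id as 4 little-endian bytes].
pub fn send_event(
    socket: &UdpSocket,
    to: SocketAddr,
    tag: u8,
    thread_id: u32,
) -> io::Result<()> {
    let mut datagram = [0u8; 5];
    datagram[0] = tag;
    datagram[1..].copy_from_slice(&thread_id.to_le_bytes());
    socket.send_to(&datagram, to)?;
    Ok(())
}

/// Receives one datagram and decodes it.
/// Returns Ok(None) if the datagram does not have the expected length.
pub fn recv_event(socket: &UdpSocket) -> io::Result<Option<(u8, u32)>> {
    let mut buf = [0u8; 16];
    let (len, _from) = socket.recv_from(&mut buf)?;
    if len != 5 {
        return Ok(None);
    }
    let thread_id = u32::from_le_bytes([buf[1], buf[2], buf[3], buf[4]]);
    Ok(Some((buf[0], thread_id)))
}
```

In this sketch the hook function would call `send_event` and the main process would sit on `recv_event`, resetting its timeout on every decoded event.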

The main problem and how I solved it

Hooks must be unhooked, and the problem arises when I want to terminate the process because my timeout has triggered. I decided to use ONCE_INIT inside DllMain to spawn a thread that waits for UDP data on another port. When data arrives, the thread locks the static container that holds all the hooks, sets an atomic flag (to prevent a new hook from being installed as soon as I release the lock; honestly I am not sure that can happen, but I want to rule out other possible issues), removes all the hooks and calls std::process::exit. The main process, instead of killing the child, sends a signal to the DLL to start the exit procedure and waits for the child to terminate. Should work, right? Sort of…
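
The lock-plus-flag ordering described above can be sketched as follows. Again this is a simplified stand-in and not my real code: `Hooks`, `install` and `begin_shutdown` are invented names, and a `usize` takes the place of the hook handle. The real thread would unhook each returned handle and then call std::process::exit.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

/// Container for all installed hooks (hypothetical type).
pub struct Hooks {
    shutting_down: AtomicBool,
    /// Thread id -> hook handle (a usize stands in for the real handle).
    by_thread: Mutex<HashMap<u32, usize>>,
}

impl Hooks {
    pub fn new() -> Self {
        Hooks {
            shutting_down: AtomicBool::new(false),
            by_thread: Mutex::new(HashMap::new()),
        }
    }

    /// Registers a hook unless shutdown has already started.
    /// Returns false if the hook was refused.
    pub fn install(&self, thread_id: u32, handle: usize) -> bool {
        let mut map = self.by_thread.lock().unwrap();
        // The flag is checked while holding the lock, so a hook cannot
        // slip in between the drain below and the process exit.
        if self.shutting_down.load(Ordering::SeqCst) {
            return false;
        }
        map.insert(thread_id, handle);
        true
    }

    /// Sets the flag while holding the lock, then returns every handle
    /// so the caller can unhook them before exiting.
    pub fn begin_shutdown(&self) -> Vec<usize> {
        let mut map = self.by_thread.lock().unwrap();
        self.shutting_down.store(true, Ordering::SeqCst);
        map.drain().map(|(_, handle)| handle).collect()
    }
}
```

Because the flag is set before the lock is released, any attach that was waiting on the lock sees the flag and gives up instead of installing a hook that nobody would remove.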

What happens

The first time I run the tool, everything seems to work flawlessly: events are sent from the DLL, which resets the timeout, and when the events stop, the process shuts down. But only the first time. If I run it again, it looks like the hooks are not working. Oh, I forgot to say: I check the result of every winapi function and panic on error, and I have a custom panic handler that shows a message box to alert me of the problem. No panic occurs. Long story short, I need to reboot my (virtual) Windows machine before I can run my tool again :sob:

As I said before, I generally work on Linux, and to me it makes no sense that a userspace program can make the OS go crazy. But maybe I am the one missing the point.

So, I am asking for suggestions and help. I tried to wrap every winapi function I use in safe and sound functions. However, the behaviour I get smells like UB from far away, and I tried many approaches before arriving at the “stop event” solution, all while trying to understand what was going on without any sanitizer available (oh, msan, ubsan and valgrind, I miss you soooo much). And now I am quite stuck, without any hint as to the cause of this strange behaviour.

Have you run into this behaviour before? Can you suggest any checks I could do? Do you know of any software that could help me find possible UB?

Sorry for the long post, but this problem is really troublesome for me :cry: