How to investigate a sporadically hanging Rocket Server

Hi, I have a Rocket server + ws server that sporadically stops responding to requests. That might happen after an hour or after two weeks. I would like to investigate why that happens but don't know where to start.

I have already started using log + simplelog to write logging output to a file but I don't know in which places I should add logging messages.

It would be great if anyone has a suggestion how to investigate an issue like this.

  • I have not yet tried to connect to the ws server while the Rocket server is blocked, that is something I'll try to do when it hangs the next time.

One thing that could cause this is a request handler that never returns, for example because of an infinite loop. I would add log statements to the start and end of every request handler. Then once it hangs, I would look through the log to find starts without a matching end. If you find such an unmatched start, that request handler never finished, so it is the one that is stuck.

You should make sure to also print end statements in the case of panics or errors in the request handler.
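One way to get the end message on every exit path is a small guard that logs when it is created and again when it is dropped. This is only a sketch: `RequestLog`, `IN_FLIGHT` and `handler` are made-up names, and `eprintln!` stands in for whatever logging macro you use (for example `log::info!`). The `Drop` impl also runs on early returns and while a panic unwinds, but not if the program is built with `panic = "abort"`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::time::Instant;

/// Number of handlers that have started but not finished yet (hypothetical helper).
static IN_FLIGHT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical guard: logs once on creation and once when it goes out of scope.
struct RequestLog {
    name: &'static str,
    started: Instant,
}

impl RequestLog {
    fn start(name: &'static str) -> Self {
        IN_FLIGHT.fetch_add(1, Ordering::SeqCst);
        eprintln!("start {}", name);
        RequestLog { name, started: Instant::now() }
    }
}

impl Drop for RequestLog {
    fn drop(&mut self) {
        IN_FLIGHT.fetch_sub(1, Ordering::SeqCst);
        // `panicking()` is true while this thread is unwinding from a panic.
        let outcome = if std::thread::panicking() { "panicked" } else { "finished" };
        eprintln!("end {} ({}) after {:?}", self.name, outcome, self.started.elapsed());
    }
}

/// Stand-in for a real request handler.
fn handler(fail: bool) -> Result<u32, String> {
    let _log = RequestLog::start("handler");
    if fail {
        // Early return: `_log` is dropped here, so the end line is still written.
        return Err("something went wrong".to_string());
    }
    Ok(42)
}

fn main() {
    let _ = handler(false);
    let _ = handler(true);
    eprintln!("handlers still running: {}", IN_FLIGHT.load(Ordering::SeqCst));
}
```

When the server hangs, a "start" line without a matching "end" line (or a counter that never goes back to zero) tells you which handler to look at.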


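Do you use any synchronisation primitives like mutexes and locks? You could add logging before you acquire a lock, just after you acquire it and as soon as you release it. Then check that the three counts are equal: an attempt without a matching acquire means a thread is blocked waiting for the lock, and an acquire without a matching release means the lock is still held. You can also try setting the number of threads to 1 and see if you can reproduce lockups.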
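The lock logging could look roughly like this with `std::sync::Mutex`. It is a sketch, not a drop-in: `lock_logged` and `LoggedGuard` are made-up names, `eprintln!` stands in for your logging macro, and recovering from a poisoned mutex is my own choice here, not something your code necessarily wants.

```rust
use std::ops::{Deref, DerefMut};
use std::sync::{Mutex, MutexGuard};

/// Hypothetical wrapper around a MutexGuard that logs when the lock is released.
struct LoggedGuard<'a, T> {
    name: &'static str,
    inner: MutexGuard<'a, T>,
}

impl<T> Deref for LoggedGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.inner
    }
}

impl<T> DerefMut for LoggedGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.inner
    }
}

impl<T> Drop for LoggedGuard<'_, T> {
    fn drop(&mut self) {
        // Runs just before the inner guard is dropped, which is what unlocks the mutex.
        eprintln!("[{}] released", self.name);
    }
}

/// Logs before waiting for the lock and right after getting it.
fn lock_logged<'a, T>(name: &'static str, mutex: &'a Mutex<T>) -> LoggedGuard<'a, T> {
    eprintln!("[{}] waiting", name);
    // Assumption: keep going even if another thread panicked while holding the lock.
    let inner = mutex.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    eprintln!("[{}] acquired", name);
    LoggedGuard { name, inner }
}

fn main() {
    let counter = Mutex::new(0u32);
    {
        let mut guard = lock_logged("counter", &counter);
        *guard += 1;
    } // guard dropped here, "released" is logged
    println!("counter = {}", *lock_logged("counter", &counter));
}
```

If the log ends with a "waiting" line that never gets its "acquired", some other thread is still holding that lock.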

If you use external services like a database, you can also check whether a call to one of them is what blocks and makes your app lock up.


I have found the strace command really helpful for investigating things like this. You can attach to the hung program with -p followed by its process ID (as root).


Thank you for all the useful hints. I have now found out that the WebSocket server does keep running in the background while the Rocket server hangs, and connections to it still work. That rules out the Mutex I am using as the cause.

I haven't found time to add the logging yet, but I hope to get that in place soon :)