Stop terminating Rust/Actix-web application on error

In PHP, if I write a bug and a user triggers it, that user gets a blank page but the website keeps working fine for everyone else. With my Rust/Actix-web application, however, it has already happened multiple times that the whole application terminates when a user triggers an error! I need a way to 'isolate' these sessions and solve this issue!

The whole damn site cannot be terminated if one user triggers an error!

That is not the normal behavior. Tokio (which actix-web uses) will catch panics in tasks, which prevents them from bringing down the process.

Perhaps you could say more about what kind of error is happening? Maybe you have a double panic, where the thread panics during a panic. This is pretty rare, but will bring down the process.
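To illustrate the isolation described above, here is a minimal sketch using plain std threads rather than Tokio or Actix itself. The helper name run_isolated is made up for this example. It assumes the default panic=unwind setting; with panic=abort the whole process would end instead.

```rust
use std::thread;

/// Runs `job` on its own thread and reports whether it finished or panicked.
/// With the default unwinding strategy, a panic only tears down that thread;
/// the caller sees it as an `Err` from `join` and keeps running.
/// (`run_isolated` is a hypothetical helper, not part of Tokio or Actix.)
fn run_isolated<F>(job: F) -> Result<u64, String>
where
    F: FnOnce() -> u64 + Send + 'static,
{
    thread::spawn(job)
        .join()
        .map_err(|_| "worker panicked".to_string())
}
```

Calling run_isolated(|| "fjweopwe".parse::<u64>().unwrap()) prints the panic message to stderr and returns an Err, and a later call such as run_isolated(|| 7) still works, because only the panicking thread was lost.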

It's an idiotic error. I used fetch_one instead of fetch_optional somewhere.

        let result = sqlx::query(&sql)
            .bind(&where_value)
            .fetch_one(&mysql.conn)
            .await;

The error is "No rows returned by a query that expected to return at least one row". It's easy to fix, but my website shouldn't terminate for this.

Indeed. Under normal circumstances, that would not abort your process. It's just a panic.

Are you able to post the error that got printed when it failed?

Have you double checked that you are not compiling the program with panic=abort?

Are you, perhaps, calling the database outside of a request's lifecycle?

Edit: Never mind, I just saw that this is triggered when a user performs an action.

As others have mentioned, this shouldn't happen in normal circumstances. Are you sure the whole server goes down, instead of dropping one connection?

Maybe you're compiling code with panic=abort setting? Do you have a reverse proxy in front of the server that could prematurely assume it's dead? (e.g. nginx can temporarily stop routing traffic if it sees too many errors from a backend).

But regardless of that, it may be a good idea to run the server under a supervisor that will restart it automatically. For example, make a systemd unit for it.

I am indeed using Nginx. So the problem is with the reverse proxy? The error displayed was 502.

Even if panics are caught per-task, the app-state could be corrupted in a way that all subsequent requests also fail. Often this happens if the panic poisons locks or mutexes.
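A small sketch of the lock-poisoning case mentioned above, using only the standard library. The function names poison and increment_tolerant are invented for this example, and whether recovering a poisoned lock is safe depends on what the shared state holds.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Simulates a handler that panics while holding the lock.
/// The panic is contained in the spawned thread, but the mutex is
/// marked as poisoned from then on.
fn poison(counter: &Arc<Mutex<u64>>) {
    let shared = Arc::clone(counter);
    let _ = thread::spawn(move || {
        let _guard = shared.lock().unwrap();
        panic!("simulated handler bug while holding the lock");
    })
    .join();
}

/// Increments the counter and returns the new value.
/// Instead of unwrapping (which would panic on every later request),
/// it recovers the guard from a poisoned lock via `into_inner`.
fn increment_tolerant(counter: &Mutex<u64>) -> u64 {
    let mut guard = counter
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner());
    *guard += 1;
    *guard
}
```

After poison(&counter) has run, a plain counter.lock().unwrap() would panic in every caller, which is how one failed request can make all following requests fail. increment_tolerant keeps working because it accepts the poisoned guard.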

In that case set max_fails either to 0 or some high value and fail_timeout to 1.

I have added the following to my nginx.conf:

upstream my_project_name {
     server 127.0.0.1:8080 max_fails=0 fail_timeout=1s;
}

Should that do it? And is it a good idea to run a second instance of my Rust app on, e.g., port 8081 and add server 127.0.0.1:8081 backup?

This should do it. Second instance isn't necessary.

Coming back to this: for example, I have a GET route /api/orders/{id}. I retrieve the id from the URL with this line of code:

    let order_id = req.match_info().get("id").unwrap().parse::<u64>().unwrap();

Agreed that this is bad code. Requesting /api/orders/fjweopwe returns the following error:

thread 'actix-rt|system:0|arbiter:0' panicked at 'called `Result::unwrap()` on an `Err` value: ParseIntError { kind: InvalidDigit }', src/route_handlers/api.rs:50:70

This crashes my website with a 502 for a few seconds (for ALL users). Luckily now it auto restarts after a few seconds.

And yes, I should handle those unwrap() calls better. But still, shouldn't Actix somehow catch these panics?
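One possible way to replace the two unwrap() calls is sketched below with only the standard library. It assumes the value from req.match_info().get("id") arrives as an Option<&str>; OrderIdError, parse_order_id and status_for are hypothetical names, and the 400 status is just one reasonable choice for a malformed id.

```rust
use std::num::ParseIntError;

/// Why an order id could not be read from the URL (hypothetical type).
#[derive(Debug, PartialEq)]
enum OrderIdError {
    /// The route had no `id` segment at all.
    Missing,
    /// The segment was present but is not a valid u64.
    Invalid(ParseIntError),
}

/// Turns the raw path segment into an id without panicking.
fn parse_order_id(raw: Option<&str>) -> Result<u64, OrderIdError> {
    raw.ok_or(OrderIdError::Missing)?
        .parse::<u64>()
        .map_err(OrderIdError::Invalid)
}

/// Maps the parse result to an HTTP status code the handler could return.
fn status_for(result: &Result<u64, OrderIdError>) -> u16 {
    match result {
        Ok(_) => 200,
        Err(_) => 400, // malformed or missing id: client error, no panic
    }
}
```

With this, a request for /api/orders/fjweopwe would yield an Err(OrderIdError::Invalid(..)) that the handler can answer with a 400 response instead of panicking the worker thread.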

Solution:
In my nginx.conf I used localhost:8080... Using 127.0.0.1:8080 solved the issue...

It does. Any panic will terminate the thread it was running on in your Actix-web server.