Using Unix signals to force a time-out on system calls

Hello everyone!

I'd like to ask your advice on using Unix signals to limit how long a blocking system call can block a thread. The problem is that while some blocking system calls (like select()) provide a way to set a time-out, others (like flock()) do not: they only have a blocking mode (without a time-out) and a non-blocking mode. For example, it would be nice to be able to wait for some time to acquire an advisory lock on a file, but flock() only allows waiting indefinitely or not waiting at all (it's the same with POSIX record locks and open file description locks). I've studied the source code of the flock command-line utility and found that it uses the timer_create() API to get the system call interrupted at the desired time. I've written an example implementation in Rust. What do you think?

Note: the nix::sys::signal::Signal enum does not support real-time signals (I want to use SIGRTMIN+3, for example). I had to work around this using std::mem::transmute().

One problem with this is that it isn't (easily) usable in a library. For example, how do you agree on which library uses which signal? You could have a multiplexing crate that everyone agrees to use and share.

Another problem: what about threads? Signals are process wide as far as I know (or was that one of the things that differed with real time signals?)

Signals are a very hacky solution to anything, but I don't know that there is a better option for these specific syscalls. Maybe someone else has a better idea.

no, signals can be sent to a specific thread with pthread_kill, or to an entire process with kill. signal handlers are per-process, but the signal mask is per-thread.

one option would be to create a new thread, perform the blocking operation on that thread, then on the main thread poll JoinHandle::is_finished in a loop until the timeout is reached, at which point you can call pthread_kill or tgkill to interrupt the thread. this might not be super performant, however it has the advantage that you don't need to muck about with setting signal handlers.
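The waiting half of this idea can be sketched with the standard library alone. The helper name `run_with_deadline` and the 1 ms polling interval are made up for illustration. On time-out the sketch hands the still-running JoinHandle back to the caller and does not attempt the pthread_kill/tgkill step.

```rust
use std::thread::{self, JoinHandle};
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `op` on a new thread and polls
/// `JoinHandle::is_finished` until `timeout` elapses.
///
/// Returns `Ok(value)` if the operation finished in time. On time-out the
/// handle of the still-running thread is returned so the caller can decide
/// how (or whether) to interrupt it; that step is not shown here.
fn run_with_deadline<T, F>(op: F, timeout: Duration) -> Result<T, JoinHandle<T>>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let deadline = Instant::now() + timeout;
    let handle = thread::spawn(op);

    while !handle.is_finished() {
        if Instant::now() >= deadline {
            return Err(handle);
        }
        // Sleep briefly instead of spinning so the waiter does not burn a core.
        thread::sleep(Duration::from_millis(1));
    }

    // The thread has finished, so join() returns without blocking.
    Ok(handle.join().expect("worker thread panicked"))
}
```

A caller would wrap the blocking call in the closure, for example `run_with_deadline(move || blocking_lock(file), Duration::from_secs(5))`, where `blocking_lock` stands for whatever blocking operation is needed. Note that interrupting the worker afterwards is the hard part: a signal sent with pthread_kill only interrupts the call if a handler is installed for it, since a signal whose default action is to terminate would end the whole process rather than just the thread.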

For flock I would suggest just polling the non-blocking version as the simplest approach.
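A minimal sketch of that polling approach, assuming Linux. The function name `flock_exclusive_with_timeout` and its parameters are invented for illustration. `flock` is declared by hand so the example needs no external crate (the `libc` crate provides the same binding), and the `LOCK_EX`/`LOCK_NB` values are the ones from `<sys/file.h>` on Linux.

```rust
use std::fs::File;
use std::io;
use std::os::raw::c_int;
use std::os::unix::io::AsRawFd;
use std::thread;
use std::time::{Duration, Instant};

// Hand-written binding to flock(2); the `libc` crate exposes the same function.
unsafe extern "C" {
    fn flock(fd: c_int, operation: c_int) -> c_int;
}

// Values from <sys/file.h> on Linux (assumption: verify for other platforms).
const LOCK_EX: c_int = 2;
const LOCK_NB: c_int = 4;

/// Hypothetical helper: tries to take an exclusive flock() on `file`,
/// retrying every `interval` until `timeout` has elapsed.
/// Returns Ok(true) once the lock is held and Ok(false) on time-out.
fn flock_exclusive_with_timeout(
    file: &File,
    timeout: Duration,
    interval: Duration,
) -> io::Result<bool> {
    let deadline = Instant::now() + timeout;
    loop {
        // SAFETY: the descriptor stays valid while `file` is borrowed.
        let rc = unsafe { flock(file.as_raw_fd(), LOCK_EX | LOCK_NB) };
        if rc == 0 {
            return Ok(true);
        }
        let err = io::Error::last_os_error();
        match err.kind() {
            // Lock held elsewhere, or the call was interrupted: retry.
            io::ErrorKind::WouldBlock | io::ErrorKind::Interrupted => {}
            _ => return Err(err),
        }
        let now = Instant::now();
        if now >= deadline {
            return Ok(false);
        }
        // Never sleep past the deadline.
        thread::sleep(interval.min(deadline - now));
    }
}
```

The trade-off is latency versus wake-ups: a short interval notices a released lock quickly but wakes the thread often, and a polling waiter has no place in a queue, so under contention another process may grab the lock between two attempts.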

Another option, which may not work, is to spawn a separate process that does the flock with a signal timeout, then try transferring the file descriptor back to the main process with pidfd_getfd.

timer_create() allows choosing a specific signal for each timer, so a library could let the caller choose the signal on a per-timer basis.

Signals can be directed at a thread or at the whole process. Specifying SIGEV_SIGNAL will result in the signal being sent to the whole process, but with timer_create() you can use the Linux-specific SIGEV_THREAD_ID to pick a specific thread to receive the signal.

That's a good alternative, but I was trying to make a select()-like timeout for cases where, for example, several processes are waiting to acquire an exclusive lock. Also, I only used flock() as an example; there are other system calls that have non-blocking and blocking-without-timeout versions, but no way of setting a timeout.

Child processes inherit file descriptors, so they don't need to transfer them. And for transferring file descriptors, I prefer transferring them through a Unix-domain socket (using the passfd crate, for example), since pidfd_getfd() requires ptrace permissions on the target process.

If you have pidfd_getfd then you are on Linux and could just create a separate task using clone without the CLONE_THREAD option.

Said task would share file descriptors and memory, and you can easily achieve synchronization with futexes, but it wouldn't share signal handlers, so all these issues of signal conflicts and other such problems wouldn't affect it.