Discrepancy between `Instant::elapsed` time and socket timeout

I have some code that operates on a Unix datagram socket and uses the SO_RCVTIMEO socket option to set a receive timeout (for context: this only runs on Linux). To verify that the method works, I have a test that looks like this:

use std::time::{Duration, Instant};

const TEST_DURATION: Duration = Duration::from_millis(20);

let start_time = Instant::now();
// This method sets the SO_RCVTIMEO option, tries to receive a packet, and then unsets the option.
assert!(socket.receive_timeout(TEST_DURATION).is_none());
assert!(start_time.elapsed() >= TEST_DURATION);
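
For reference, the method is roughly along these lines (a simplified sketch, not the exact code, written as a free function for brevity; it assumes std's UnixDatagram, whose set_read_timeout is implemented with SO_RCVTIMEO):

use std::os::unix::net::UnixDatagram;
use std::time::Duration;

// Simplified sketch: set SO_RCVTIMEO, try one recv, then clear the option again.
fn receive_timeout(socket: &UnixDatagram, timeout: Duration) -> Option<Vec<u8>> {
    // set_read_timeout(Some(..)) sets SO_RCVTIMEO on the underlying fd.
    socket.set_read_timeout(Some(timeout)).expect("set SO_RCVTIMEO");
    let mut buf = [0u8; 1024];
    // A recv that times out fails with WouldBlock/TimedOut rather than returning data.
    let received = socket.recv(&mut buf).ok().map(|n| buf[..n].to_vec());
    // Unset the timeout so later receives block indefinitely again.
    socket.set_read_timeout(None).expect("clear SO_RCVTIMEO");
    received
}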

The vast majority of the time, this test passes. However, on very rare occasions (empirically <0.5% of runs), it fails: the measured elapsed duration is less than the expected duration, though the difference is within 0.1 ms.

I know all sorts of issues can make timing intervals longer than requested (which is why e.g. thread::sleep only claims "to sleep for at least the specified amount of time"), but I wouldn't expect the interval to come out shorter than requested.

The method itself also makes a couple of other syscalls to set the option before receiving and to unset it afterwards, but if those take a substantial amount of time, they should make the interval longer, not shorter, so those syscalls (and anything else the method does) shouldn't be the issue here.

Anyone know what could be going on to cause this?

I would assume the issue is Instant being inaccurate; it's not always the most precise measurement.

It's possible that, since the Instant API relies on a monotonic clock, the kernel's timeout handling uses a slightly different time source or resolution. You could subtract a small tolerance from the required elapsed duration.
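
For example, something like this (a sketch reusing the names from your test, with an arbitrarily chosen 1 ms tolerance):

const TEST_DURATION: Duration = Duration::from_millis(20);
// Arbitrary slack to absorb small discrepancies between the kernel's
// timeout handling and the Instant measurement.
const TOLERANCE: Duration = Duration::from_millis(1);

let start_time = Instant::now();
assert!(socket.receive_timeout(TEST_DURATION).is_none());
// Accept a measurement that comes in slightly under the requested timeout.
assert!(start_time.elapsed() >= TEST_DURATION - TOLERANCE);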

If that's true, is there a more accurate timer I can use?

If it does use a different clock, what's the difference between them? I'd happily subtract a tolerance if I knew how large the difference could be, but I don't (it's usually within 0.05 ms, sometimes 0.05-0.1 ms, but how do I know larger differences aren't also going to happen, just too rarely for me to have observed them so far?).

I know you said Linux, but 15.6 ms is a common granularity on Windows at least (due to timers). I would assume that timers on Linux are similarly coalesced to save power, and you're just getting unlucky with rounding sometimes.
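
If you want to see what the kernel advertises as the resolution of the clock Instant uses on Linux (CLOCK_MONOTONIC), clock_getres will report it. A minimal sketch, assuming the libc crate:

use libc::{clock_getres, timespec, CLOCK_MONOTONIC};

fn main() {
    // Zero-initialise; clock_getres fills in the resolution.
    let mut res: timespec = unsafe { std::mem::zeroed() };
    // SAFETY: clock_getres only writes into the timespec we pass it.
    let rc = unsafe { clock_getres(CLOCK_MONOTONIC, &mut res) };
    assert_eq!(rc, 0);
    // Typically reports 1 ns on Linux; note this is the clock's advertised
    // resolution, not the granularity of the kernel's timeout handling.
    println!("CLOCK_MONOTONIC resolution: {}s {}ns", res.tv_sec, res.tv_nsec);
}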