I have been looking into an issue where our system slows to a crawl under load. I am attaching a flamegraph captured under load; this appears to be similar to the issue discussed in "Weird contention in tokio::time::Clock::now". Execution basically ends up in the cfg_test_util! implementation of the Clock, and about 70% of the time is spent waiting for the lock.
I forked the tokio GitHub repository (tag 1.19.2) and added some debug output to verify that this is the case.
I tried a temporary fix where the cfg_test_util! Clock::now() just returns Instant::from_std(std::time::Instant::now()) without grabbing the lock (the pause functionality is not needed in the prod version anyway). This does indeed solve the performance issue.
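To illustrate why the test-util clock can become a bottleneck, here is a minimal std-only sketch. It is not tokio's actual code: `PausableClock`, `Inner`, and `system_now` are hypothetical names, and the sketch only assumes that a pausable clock has to keep its state behind a lock that every reader takes.

```rust
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a pausable test clock (not tokio's real type).
/// Every call to `now()` takes the mutex, so hot callers serialize on it.
struct PausableClock {
    inner: Mutex<Inner>,
}

struct Inner {
    /// Virtual time accumulated up to the last pause or advance.
    base: Instant,
    /// Real instant at which the clock was last running; `None` while paused.
    unfrozen: Option<Instant>,
}

impl PausableClock {
    fn new() -> Self {
        let now = Instant::now();
        Self {
            inner: Mutex::new(Inner { base: now, unfrozen: Some(now) }),
        }
    }

    /// Freeze the clock at its current virtual time.
    fn pause(&self) {
        let mut inner = self.inner.lock().unwrap();
        if let Some(started) = inner.unfrozen.take() {
            inner.base += started.elapsed();
        }
    }

    /// Move virtual time forward by `by`.
    fn advance(&self, by: Duration) {
        self.inner.lock().unwrap().base += by;
    }

    /// Read the virtual time; this is the lock acquisition that shows up
    /// as contention when many threads call it concurrently.
    fn now(&self) -> Instant {
        let inner = self.inner.lock().unwrap();
        match inner.unfrozen {
            Some(started) => inner.base + started.elapsed(),
            None => inner.base,
        }
    }
}

/// Production-style read: no shared state, so no lock to contend on.
fn system_now() -> Instant {
    Instant::now()
}
```

The temporary fix described above corresponds to always taking the `system_now` path, which gives up pause/advance support in exchange for a lock-free read.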
I also tried setting resolver = 2, but this didn't make a difference. This is probably because we are already using cargo 1.60.0 (d1fd9fe2c 2022-03-01), and cargo 1.50.0 was the switch over from resolver = 1, IIUC. The cargo tree -e features -i tokio output also matches.
Is something not being set up correctly to prevent the test code leakage? Please let me know if any other info is needed.
Thanks. I rechecked, and the test-util feature is not specified anywhere in the code base. Maybe it comes from some transitive dependency; I'm not sure. Also, I am unable to repro it in a standalone example.
I grabbed the cargo tree -e features -i tokio --charset ascii output. All the parts where test-util shows up are listed here.
I cross-checked the feature tree above. All the test-util occurrences check out, except the one on line 2491: I can't quite tell where that test-util comes from. One explanation could be that it is leaked from the dev-dependencies even with resolver=2.
One option to find out would be to fork tokio, delete the test-util feature, use the [patch] section to override tokio with the modified version, and then run cargo build and see what breaks.
You appear to have a tower-test dependency, which depends on tokio-test, which in turn depends on tokio's test-util feature. Please make sure that tower-test is specified in dev-dependencies only.
OK, found the issue. Apart from not using test deps in prod, both the edition and the resolver need to be set correctly as well. The resolver part is taken care of just by having the Rust version >= 1.50.
Not the edition part, though. I can repro the issue by switching the edition value (verified by patching tokio to my debug fork). With edition = 2018, I can see the test deps leaking; with 2021, it works as expected.
But unlike resolver, edition can't be set at the workspace level, unfortunately.
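For anyone landing here later, the edition dependence is consistent with edition 2021 defaulting to the version 2 feature resolver, which does not activate features requested only by dev-dependencies unless a test or example target is being built. The toy model below sketches that rule in Rust. It is not how Cargo is implemented; `Kind`, `Request`, and `unified_features` are hypothetical names used only for illustration.

```rust
use std::collections::BTreeSet;

/// Whether a crate is pulled in as a normal or a dev dependency.
#[derive(Clone, Copy, PartialEq)]
enum Kind {
    Normal,
    Dev,
}

/// One edge in the dependency graph asking for some features of a crate.
struct Request {
    kind: Kind,
    features: &'static [&'static str],
}

/// Toy model of feature unification for a single crate (e.g. tokio).
/// - Old resolver: features from all requests are merged, dev or not.
/// - Version 2 resolver: dev-dependency requests only count when a
///   target that needs them (tests, examples) is being built.
fn unified_features(
    requests: &[Request],
    resolver_v2: bool,
    building_tests: bool,
) -> BTreeSet<&'static str> {
    requests
        .iter()
        .filter(|r| r.kind == Kind::Normal || !resolver_v2 || building_tests)
        .flat_map(|r| r.features.iter().copied())
        .collect()
}
```

In this model, a dev-only request for test-util ends up in a plain build under the old resolver but not under version 2, which matches the leak seen with edition = 2018.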