How to accuratedly time async functions?

I have a single thread app (tokio worker_threads = 1) that calls a bunch of API endpoints using the curl crate. In a for loop, I iterate through the endpoints and call an async function send_requests(endpoint_ips), store the futures in a vector and then call join_all(futures).

Since the send_requests() is called with the list of IPs for a given endpoint, I use the curl's Multi to connect to all the given IPs simultaneously, as such:

 while multi.perform().unwrap() > 0 {
       tokio::task::yield_now().await;
    }

After that I collect the metrics from the Easy2 handles which were added to the Multi handle. For instance, connect_time()[1].

The problem with this is that the timing reported by the methods like connect_time(), pretransfer_time(), etc, seem to be getting skewed by the number of async tasks I have. For example, if I double the number of endpoints that I'm calling, the timing reported by curl will increase significantly. What I suspect to be happening is that while tokio is busy scheduling and handling other tasks/futures, the clocks for the futures it's awaiting on are still running. But I'm new to Rust, so I may be completely wrong and not doing things the right way.

Have you folks run into similar issues? Any suggestions, ideas, thoughts?

Cheers!

[1] Easy2 in curl::easy - Rust

curl isn't a rust async crate. Your while loop I expect would keep the CPU needlessly running.

Multi supplies its own poll, not really compatible with async rust.

At a glance (maybe incorrect); the timer start when Easy2 is constructed so limit their construction for being inside the limits Multi has set.

OR use one of rusts async crates instead of curl.

I couldn't find any async crate that provides me with the metrics I need, TCP handshake time, FBL, TLS handshake, etc. :frowning:

But to be honest I don't think curl to be the issue here. If I have the same logic described in my post, but instead of calling send_requests() I call another async function called that uses Tokio's async DNS resolver AsyncResolver, I get inaccurate timing:

let resolver = AsyncResolver::tokio(conf, opts);
resolver.clear_cache();
let start = Instant::now();
let lookup = resolver.lookup_ip(endpoint).await;
let end = start.elapsed();

You can use tracing spans. Instrument your async functions with #[instrument]. The span is entered when the corresponding future is polled and exited when it returns from polling.

Appreciate the reply. Can you give me an example on how to capture the execution time of the async function? I'm setting up the subscriber as such:

    let log_config = &opt.log_config;
    let log_level = EnvFilter::default().add_directive(Directive::from(log_config.log_level));
    let file_appender =
        tracing_appender::rolling::hourly(&log_config.log_directory, &log_config.log_file_name);
    let (non_blocking, _guard) = tracing_appender::non_blocking(file_appender);

    let timer = SystemTime::default();
    let subscriber = FmtSubscriber::builder()
        .with_env_filter(log_level)
        .with_ansi(false)
        .with_writer(non_blocking)
        .with_timer(timer)
        .finish();
    tracing::subscriber::set_global_default(subscriber)
        .expect("setting default subscriber failed");
    info!("Logging initialized");

And then instrumenting my future like this:

let span = info_span!("my_async_operation");
let _guard = span.enter();
let f = send_requests(endpoint, ips);
let r = f.instrument(span).await;

In my logs I do see the see the span, but no execution time. Only my application logs. Example:

Jul 25 20:58:03.587  INFO my_async_operation: foo: Sending requests to: bar

Initializing my subscriber with with_span_events seems to log time.busy and time.idle:

FmtSubscriber::builder()
        .with_env_filter(log_level)
        .with_ansi(false)
        .with_writer(non_blocking)
        .with_span_events(FmtSpan::CLOSE)
        .init();

And then calling tokio's asyncresolver's lookup_ip() like this:

    let span = span!(Level::INFO, "lookup_ip");
    let f = resolver.lookup_ip(endpoint);
    let lookup = f.instrument(span).await;

Log:

Jul 25 22:06:41.771  INFO lookup_ip: foo: close time.busy=490µs time.idle=34.7ms

Regardless of which endpoint I resolve, the time.busy is always in the microseconds, and I can confirm the dns lookup is not cached. So I don't think this value can be used as a way to knowing how much time the resolution itself took. According to the doc it means:

span’s busy time (the total time for which it was entered) and idle time (the total time that the span existed but was not entered).

The idle time measures how long your I/O took together with the time the future could have made progress but was not being executed by the executor.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.