How do we do metrics these days?

Hey all,
I've neglected metrics and observability in general over the last couple of years, but I need them for a project now, so I'm looking to catch up.

I've read the docs for the tracing and metrics crates and successfully experimented with several different tracing configurations -- liked it. I haven't tried metrics yet, simply because I don't have a clear picture of how the whole thing fits together. I'd like to close that gap.

To my question:

The main thing I'm interested in: are there good reference / example projects that demonstrate how to collect metrics end to end, which I could take inspiration from? By that I mean not only Rust source but also e.g. Dockerfiles for Jaeger, Prometheus, Kafka, Grafana and what-have-you.

What I'd also love to hear from you:

  • Pointers to exactly which libraries and external software I'll need. Alternatives to each of them would be very welcome.
  • Commonly used hosted solutions. I'm not against them, but I'd like to make an effort to self-host first. It would still be very valuable to know how people go about it with hosted solutions.
  • Diagrams that orient the reader: what fits where, what the general stack is, which pieces are mandatory and which are optional, some pros and cons, etc.

Apologies for the rather vague and broad question. I have already done some homework but I find myself a bit lost -- so I'd appreciate any informative takes that you might offer.

Thank you for your time.

After a little more research, I'm specifically looking into integrating the #[tracing::instrument] macro with sending spans to OpenTelemetry-enabled software.

I started looking into several examples I found on the net, but so far no dice.

Anyone?

I’ll provide a longer explanation and guide in a few hours, as my wife is graduating from veterinary school today. In the meantime, this draft PR might give you a bit of an overview and point you in the right direction: https://github.com/tokio-rs/website/pull/654/

Looking through the PR and linked articles right now. Would you be able to provide the explanation you alluded to?
