Logging/debugging/diagnostic info beyond println / web_sys::console::log_1

At some point, the amount of 'debug/diagnostic' info exceeds what is easy to process with println / web_sys::console::log_1. What is a good way of handling this?

One crazy idea is to send all logs to a sqlite3 db, then query it with SQL.

I am interested in hearing other solutions to the "finding a needle in the println / web_sys::console::log_1 haystack" problem.

Structured logging

No, it's more of a querying issue than a logging issue.

@jchimene has a point. If you want to use copious logs to find bugs and perf problems, then it helps if the logs are structured, so that they can be parsed reliably. For instance, having ENTER/EXIT (and EXN for languages that raise exceptions) log-lines with known structure (like class/method-names) can allow you to reconstruct partial call-trees, and if they contain timestamps of adequate granularity, you can use that to find performance bottlenecks. Similarly, if you log session-id information and maybe thread-id info, you can use that to "sessionize your logs".
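As a rough illustration of the ENTER/EXIT idea, here is a minimal sketch that rebuilds call spans (with nesting depth and duration) from such lines. The line format `<timestamp_us> <session_id> <ENTER|EXIT> <name>` and the names `Span` and `reconstruct_spans` are made up for this example, not taken from any real logging library.

```rust
use std::collections::HashMap;

// Hypothetical log-line format (an assumption for this sketch):
//   "<timestamp_us> <session_id> <ENTER|EXIT> <name>"

#[derive(Debug, PartialEq)]
pub struct Span {
    pub session: String,
    pub name: String,
    pub depth: usize,     // nesting depth within its session (0 = top level)
    pub start_us: u64,
    pub duration_us: u64,
}

/// Rebuild completed call spans from ENTER/EXIT lines.
/// Lines that do not parse, and EXITs that do not match the innermost
/// open call, are skipped.
pub fn reconstruct_spans(log: &str) -> Vec<Span> {
    // One open-call stack per session, so interleaved sessions do not mix.
    let mut stacks: HashMap<String, Vec<(String, u64)>> = HashMap::new();
    let mut spans = Vec::new();

    for line in log.lines() {
        let mut parts = line.split_whitespace();
        let (Some(ts), Some(session), Some(kind), Some(name)) =
            (parts.next(), parts.next(), parts.next(), parts.next())
        else {
            continue;
        };
        let Ok(ts) = ts.parse::<u64>() else { continue };
        let stack = stacks.entry(session.to_string()).or_default();
        match kind {
            "ENTER" => stack.push((name.to_string(), ts)),
            "EXIT" => {
                if stack.last().map(|(n, _)| n.as_str()) == Some(name) {
                    let (n, start) = stack.pop().unwrap();
                    spans.push(Span {
                        session: session.to_string(),
                        name: n,
                        depth: stack.len(),
                        start_us: start,
                        duration_us: ts.saturating_sub(start),
                    });
                }
            }
            _ => {}
        }
    }
    spans
}
```

Spans come out in the order their EXIT lines appear, so sorting by `duration_us` gives a first cut at "where is the time going", and grouping by `session` is the sessionizing step.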

There's a general statement here: if you add data to your log-lines in parseable format, you can use them to do detective work in your running code. There are entire companies (Splunk, probably lots of others) that use logs to help you diagnose problems.

In my experience, loading logs into an SQL DB isn't very useful, unless the analysis you want to do requires that sort of query capability. That is to say, first figure out what analysis you want to do, then organize your log-data to support it. But before you even get there, you need to know what kind of logging you should be producing, to drive those analyses.

BTW, I've been a big fan of Google's glog. I see that there's a really, really minimal first attempt at implementing something like it for Rust. Log-lines that you can enable at runtime are really powerful: they're already compiled in, and you can turn 'em on at program startup, or even afterwards based on interactive commands, perhaps via a web-server interface. The key is to make the runtime cost of a log-line that is not enabled as low as possible.
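To show what "cheap when disabled" can look like, here is a minimal sketch using a global atomic verbosity level. This is not glog's API or any existing crate's; `VERBOSITY`, `set_verbosity`, `enabled` and `vlog!` are hypothetical names for illustration. In a wasm build you would presumably swap `eprintln!` for a call to web_sys::console::log_1.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Global verbosity. A line at level N is emitted when N <= VERBOSITY,
/// so level-0 lines are always on and higher levels are off by default.
/// It can be changed at any time, e.g. from a debug command.
static VERBOSITY: AtomicU8 = AtomicU8::new(0);

pub fn set_verbosity(level: u8) {
    VERBOSITY.store(level, Ordering::Relaxed);
}

pub fn enabled(level: u8) -> bool {
    level <= VERBOSITY.load(Ordering::Relaxed)
}

/// Hypothetical `vlog!` macro: the format arguments are only evaluated
/// when the level is enabled, so a disabled line costs one relaxed
/// atomic load and one branch.
macro_rules! vlog {
    ($level:expr, $($arg:tt)*) => {
        if enabled($level) {
            eprintln!($($arg)*);
        }
    };
}
```

Usage would be something like `vlog!(2, "worker {} got {:?}", id, msg);` sprinkled through the code, with `set_verbosity(2)` called whenever you want those lines to start appearing.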

For simplicity, let's imagine that all log entries are of the form:

(time: u64, Tuple_N<(key, value)>) where the Tuple key/values are heterogeneous

So then, the type of debug queries I want to make are of the form:

  1. what are the latest values associated with k1, k2, k3 ?

  2. for k1, give me all (time, value) pairs

  3. find me the first time instance after t_0 where a given predicate is false
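To make those three queries concrete, here is a rough in-memory sketch. All the names (`Value`, `Entry`, `LogStore`) are made up for illustration, the heterogeneous values are modelled as an enum, and every query is a plain linear scan with no indices, so treat it as a baseline rather than a recommendation.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    Float(f64),
    Text(String),
}

#[derive(Debug, Clone)]
pub struct Entry {
    pub time: u64,
    pub fields: Vec<(String, Value)>,
}

impl Entry {
    pub fn get(&self, key: &str) -> Option<&Value> {
        self.fields.iter().find(|(k, _)| k.as_str() == key).map(|(_, v)| v)
    }
}

#[derive(Default)]
pub struct LogStore {
    // Assumed to be appended in non-decreasing time order.
    entries: Vec<Entry>,
}

impl LogStore {
    pub fn push(&mut self, time: u64, fields: Vec<(&str, Value)>) {
        let fields = fields.into_iter().map(|(k, v)| (k.to_string(), v)).collect();
        self.entries.push(Entry { time, fields });
    }

    /// Query 1: the latest value for each requested key (None if never logged).
    pub fn latest<'a>(&'a self, keys: &[&str]) -> Vec<Option<&'a Value>> {
        keys.iter()
            .map(|k| self.entries.iter().rev().find_map(|e| e.get(k)))
            .collect()
    }

    /// Query 2: all (time, value) pairs for one key, oldest first.
    pub fn history<'a>(&'a self, key: &str) -> Vec<(u64, &'a Value)> {
        self.entries
            .iter()
            .filter_map(|e| e.get(key).map(|v| (e.time, v)))
            .collect()
    }

    /// Query 3: the first entry strictly after `t0` for which `pred` is false.
    pub fn first_failure_after(
        &self,
        t0: u64,
        pred: impl Fn(&Entry) -> bool,
    ) -> Option<&Entry> {
        self.entries.iter().find(|e| e.time > t0 && !pred(*e))
    }
}
```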

Current intuition is to stuff all this into a sql db; reason being: at compile time, all the key/value types are known, so the table schemas should be known.

However, if there is a better solution, I am happy to listen.

I'm not trying to tell you not to use an SQL database. Just observing that sometimes it's the right tool, and sometimes it isn't. For instance, in your example, you're assuming two indices:

  1. index (k, time)
  2. index (time)

What I'm trying to say, is that you've already presupposed the access paths for your data. Which is the same as saying that you're building a custom data-store for the kind of problem you're trying to solve.

By contrast, what most people with massive logs do, is to build data warehouses, which are explicitly designed to allow many kinds of queries, albeit less efficiently than a data-store aimed at one class of queries.

In any case, there's one really important thing: don't put your data into anything other than logfiles, from your program. At most, something like Kafka. Then postprocess the logfile to load into whatever datastore you use for analytics. B/c the last thing you want (and I know this from experience) is for your program to hang up, b/c your datastore can't keep up. Boy howdy, that's fun.
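Here is a minimal sketch of keeping the program decoupled from a slow sink, assuming a native (non-wasm) build with threads. The names `Logger` and `spawn_logger` are hypothetical. The application side only does a non-blocking `try_send` into a bounded queue and drops the line if the queue is full; a background thread does the actual writing.

```rust
use std::io::Write;
use std::sync::mpsc::{sync_channel, SyncSender};
use std::thread::{self, JoinHandle};

pub struct Logger {
    tx: SyncSender<String>,
}

impl Logger {
    /// Never blocks. Returns false if the line was dropped because the
    /// queue was full or the writer thread has gone away.
    pub fn log(&self, line: impl Into<String>) -> bool {
        self.tx.try_send(line.into()).is_ok()
    }
}

/// Spawn a writer thread that drains the queue into `sink` (a logfile,
/// for instance). The thread exits and hands the sink back once every
/// `Logger` has been dropped.
pub fn spawn_logger<W: Write + Send + 'static>(
    mut sink: W,
    capacity: usize,
) -> (Logger, JoinHandle<W>) {
    let (tx, rx) = sync_channel::<String>(capacity);
    let handle = thread::spawn(move || {
        for line in rx {
            // Write errors are ignored here; a real program might count them.
            let _ = writeln!(sink, "{line}");
        }
        let _ = sink.flush();
        sink
    });
    (Logger { tx }, handle)
}
```

Dropping lines under pressure is a design choice in this sketch: you lose some diagnostics, but a stalled sink can no longer stall the program.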

LOL. I think the core issue here is that you are generously overestimating the scale I am operating at.

I'm at Sqlite3 scale. I'm not at data warehouse / lakehouse scale. 🙂

ha! OK, point taken. In that case, can I just suggest that you should please push your logs into logfiles, and then convert to SQL? B/c really, you don't want to live thru a "my app broke b/c my logging solution sucked". Seriously, life's too short for that.

But also: sure, once you have raw log data, post-processing can be of all sorts, and really, the sky's the limit. So an SQL db? Sure. But all sorts of other stuff is also useful and interesting.

This is my fault for not stating all this up front. Here is my logging problem:

I am building a webapp with multiple (50+) webworkers. web_sys::console::log_1 is no longer cutting it. I want to log to sqlite3 in real time so that I can query it in real time ... at the Chrome dev tools console.

Right now, when I open up the Chrome dev tools console, there are console logs from 50 web worker threads, and I have no idea what is going on.

I want all those events stuffed into sqlite3 so that in the chrome dev tools console, I can type things like 'select * from ... where ...' and get just the events I want.

If you have parallel work from 50+ "threads", I suspect you will want to log some sort of session-id, so you can sessionize your logs. And also, I think you'll find that logging directly into sqlite will not work, unless you also configure those indices. But configuring those indices will mean that logging isn't cheap anymore, b/c it will incur Btree maintenance. So .... perhaps you might want to instead log to a file, and write a program that slurps the file into sqlite, and monitors the file for appends, slurping those appends over time? That way, you aren't blocking your program for DB inserts.
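The "monitor the file for appends" half of that can be sketched with just the standard library. `read_appended` is a hypothetical helper: it returns the complete lines added since the last offset, and the caller would insert each one into sqlite (not shown, since that depends on which SQLite binding you use) and poll again with the returned offset.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Read the complete lines appended to `path` since byte `offset`.
/// Returns those lines plus the offset to pass on the next poll.
pub fn read_appended(path: &Path, offset: u64) -> io::Result<(Vec<String>, u64)> {
    let mut file = File::open(path)?;
    file.seek(SeekFrom::Start(offset))?;
    let mut buf = String::new();
    file.read_to_string(&mut buf)?;

    // Only consume up to the last newline; a partially written trailing
    // line is left in the file for the next poll.
    let complete = match buf.rfind('\n') {
        Some(i) => &buf[..=i],
        None => "",
    };
    let lines = complete.lines().map(str::to_string).collect();
    Ok((lines, offset + complete.len() as u64))
}
```

This assumes the logfile is only ever appended to (no truncation or rotation) and that a simple polling loop is good enough, which seems reasonable at this scale.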

The more-general version of this is to put a log-saving mechanism like Kafka in-between, but sure, that might be overkill. A logfile is the simple version of that.

I'm really failing at explaining the problem. 🙂

This is for local development purposes only -- answering questions of the form: why the $*@( is the app displaying X when I expect it to display Y.

There is no logging to disk, everything is in memory in Chrome.