Rough numbers for rayon thread pool startup/shutdown

Librsvg (on crates.io now, if you didn't notice!) currently uses multiple cores in only a few of the SVG image-processing filters (feGaussianBlur and one other). Specifically, it only uses rayon::scope() and par_chunks_mut with the global thread pool.

Consider a program like gnome-shell, or a GTK application, which renders SVG icons constantly. When icons use those SVG filters, what I think happens is this:

  1. gnome-shell starts up.
  2. gnome-shell calls librsvg to render one icon.
  3. The first time that librsvg does a gaussian blur, rayon creates its global thread pool.
  4. gnome-shell keeps running and rendering icons.
  5. rayon's global thread pool stays around for the duration of the program.

This is fine, but people are complaining that when debugging gnome-shell crashes, they have a bunch of idle threads from rayon and it makes stack traces ugly.

I think switching librsvg to a non-global thread pool could provide some benefits: the ability to terminate the thread pool, and possibly letting the caller configure the number of threads or the stack size for peculiar environments like musl.

Librsvg's "transaction" is basically rendering an SVG document. I can make it create a rayon::ThreadPool when a document is loaded, and drop the thread pool when the document is freed.

I should really run some tests to get hard numbers, but would anyone have ballpark numbers for how long it takes to create and free a thread pool, assuming all the work is done when it is dropped?

I.e. I'm trying to get an estimate on the difference between these two cases:

  • gnome-shell's librsvg automatically creates the global thread pool once and never drops it, so all invocations of librsvg except for the first one use threads that are ready for work.
  • gnome-shell's librsvg creates and drops a thread pool for each invocation.
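
To make the second case concrete, here is a minimal sketch of "threads that exist only for one invocation". It is a standard-library stand-in, not rayon: std::thread::scope plus chunks_mut plays the role of rayon::scope() plus par_chunks_mut. The names process_chunk and render_with_fresh_threads are hypothetical, and the "filter" is just a placeholder that brightens bytes.

```rust
use std::thread;

/// Hypothetical stand-in for one filter pass: brighten every byte in place.
fn process_chunk(chunk: &mut [u8]) {
    for px in chunk.iter_mut() {
        *px = px.saturating_add(10);
    }
}

/// Process `pixels` with worker threads that are created for this call
/// and joined before it returns (the "create and drop per invocation" case).
fn render_with_fresh_threads(pixels: &mut [u8], workers: usize) {
    let workers = workers.max(1);
    // Ceiling division, so at most `workers` chunks are produced.
    let chunk_len = ((pixels.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        for chunk in pixels.chunks_mut(chunk_len) {
            // One short-lived thread per chunk.
            s.spawn(move || process_chunk(chunk));
        }
    }); // every spawned thread has been joined by this point
}
```

Calling render_with_fresh_threads(&mut buffer, 4) pays the thread startup and teardown cost on every call, which is exactly the overhead the question is about; a long-lived pool pays it once.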

Does GTK give you a thread pool you can piggyback on, rather than using rayon?

It feels like starting a thread pool with std::thread::available_parallelism() number of threads every time you load a document is pretty expensive, or at least pretty wasteful.

That said, browsing random Stack Overflow answers suggests that the startup time of a thread is in the tens of microseconds, so maybe it's not a big deal? :man_shrugging:

I tried this out. It's about 0.5 ms to 1 ms for 10 threads. It looks like roughly 0.2 ms for the first thread and 0.03 ms per extra thread.
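
For anyone who wants to reproduce this kind of ballpark figure, a rough sketch of one way to measure it follows. It times bare std::thread spawn and join, not a rayon pool, so it omits whatever setup rayon does on top; the function names are made up for illustration, and the results will vary by machine and OS.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Spawn `n` threads that do no work, join them all, and return the elapsed time.
fn time_spawn_and_join(n: usize) -> Duration {
    let start = Instant::now();
    let handles: Vec<_> = (0..n).map(|_| thread::spawn(|| {})).collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    start.elapsed()
}

/// Average the measurement over `runs` repetitions to smooth out noise.
fn average_spawn_cost(n: usize, runs: u32) -> Duration {
    assert!(runs > 0, "need at least one run");
    let total: Duration = (0..runs).map(|_| time_spawn_and_join(n)).sum();
    total / runs
}
```

Comparing average_spawn_cost(1, 100) with average_spawn_cost(10, 100) gives a fixed cost and a per-thread cost that can be set against the figures above. Build with optimizations and expect the first run to be slower than the rest.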

GTK indeed has a thread pool (actually, it's part of GLib), but I'm not completely sure how to make rayon run its work in those threads rather than its own. I would really like to keep using rayon's abstractions for iterators and scope(). I think rayon lets one override how threads are created, but GLib's thread pool API works in terms of pushing "jobs" to it, which it then marshals to threads it manages.

Thanks, this is useful!

Rayon's ThreadPoolBuilder has spawn_handler for this. I'm not sure whether just adding this to the global thread pool would be enough to reduce the debugging clutter, but it might also lower the cost of spawning threads, which would make short-lived pools work better.
