Make multithreaded/async app more readable in htop/gdb?

Call std::process::Command

While this isn't inherently problematic (if that command is not itself indirectly waiting on anything else that your program is doing), it's a very inefficient use of rayon (unless the command is going to itself use 100% of a core until it exits). It would be best if whatever commands you need to run were started separately (perhaps using tokio's process APIs) and you don't invoke rayon except when you have some output to process (or some input to prepare).
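As a rough illustration of "started separately", here is a minimal std-only sketch that runs the external commands on a few plain OS threads instead of on rayon workers, so a thread blocked in `Command::output` never occupies a slot in rayon's pool. The `run_commands` helper and its `max_parallel` parameter are hypothetical names made up for this example; tokio's process APIs would be another way to get the same separation.

```rust
use std::io;
use std::process::{Command, Output};
use std::sync::Mutex;
use std::thread;

/// Hypothetical helper: run every command on at most `max_parallel` plain OS
/// threads and return the results in the same order as `jobs`.
/// No rayon worker is ever blocked waiting on a child process.
fn run_commands(jobs: Vec<Command>, max_parallel: usize) -> Vec<io::Result<Output>> {
    let n = jobs.len();
    // Shared work queue: each helper thread pulls the next (index, command) pair.
    let queue = Mutex::new(jobs.into_iter().enumerate());
    // One result slot per job, filled in by whichever thread ran it.
    let results: Mutex<Vec<Option<io::Result<Output>>>> =
        Mutex::new((0..n).map(|_| None).collect());

    thread::scope(|s| {
        for _ in 0..max_parallel.clamp(1, n.max(1)) {
            s.spawn(|| loop {
                // The queue lock is released at the end of this statement,
                // before the (slow) command runs.
                let next = queue.lock().unwrap().next();
                let Some((i, mut cmd)) = next else { break };
                // Blocks only this helper thread until the child exits.
                let out = cmd.output();
                results.lock().unwrap()[i] = Some(out);
            });
        }
    });

    results
        .into_inner()
        .unwrap()
        .into_iter()
        .map(|slot| slot.expect("every job was taken from the queue"))
        .collect()
}
```

Once all outputs are collected, any CPU-heavy post-processing can go through a single `par_iter()` over the results, which keeps rayon's threads busy with actual computation.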

Yeah, the Commands are mostly just calling tesseract (single-threaded), with rayon parallelizing over the different pages of a PDF, etc.

The OnceCell calls lopdf to parse the PDF, which does use rayon!!
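One way to rule out that interaction, sketched here with only the standard library, is to fill the cell once on the calling thread before any parallel work starts, so workers only ever read an already-initialized value and never wait on an initializer. `ParsedDoc`, `parse_document`, and `sum_page_indices` are hypothetical stand-ins for this example, not lopdf's API, and the page count is made up.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

// Hypothetical stand-in for the parsed PDF held in the cell.
struct ParsedDoc {
    page_count: usize,
}

static DOC: OnceLock<ParsedDoc> = OnceLock::new();
// Counts how many times the (expensive) parse actually runs.
static PARSE_CALLS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical parser; the real program would call into lopdf here.
fn parse_document() -> ParsedDoc {
    PARSE_CALLS.fetch_add(1, Ordering::SeqCst);
    ParsedDoc { page_count: 8 }
}

// Splits the page indices across `workers` threads and sums them.
fn sum_page_indices(workers: usize) -> usize {
    // Initialize up front, outside the parallel section: no worker can end up
    // blocked on the cell while another thread is still parsing.
    let doc: &'static ParsedDoc = DOC.get_or_init(parse_document);

    thread::scope(|s| {
        let handles: Vec<_> = (0..workers)
            .map(|w| {
                // Worker `w` handles pages w, w + workers, w + 2 * workers, ...
                s.spawn(move || (w..doc.page_count).step_by(workers).sum::<usize>())
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum::<usize>()
    })
}
```

The same ordering applies with rayon: call `get_or_init` before the first `par_iter()`, not from inside it.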

However, I don't think that is the cause of the bug, because I disabled it and the bug remained (unless it was only one of several contributors and I have something similar somewhere else). I don't call par_iter in many places, though, and they all look like they're doing very reasonable things!! :frowning:

This bug... it progressively gets worse, with CPU usage eventually dropping to basically 0% on all but two or three threads. Then, after a LONG while, it may clear up and suddenly, boom, everything fills up and goes back to 100% CPU. Sometimes these dips last for very long periods, sometimes they clear up again on their own... but there are unexplainable dips...

Interesting. I think maybe nested rayon is just not a good idea, because I yanked a lot of the par_iter()s and now CPU load is a very respectable 100% all of the time. There simply aren't any of these dips any more!
