Say I have two "threads"/tasks/jobs/whatever.
One is IO-bound (it gets data from a DB) and the other is CPU-bound (it crunches the data).
I want the IO thread to prefetch data for the CPU thread to work on.
Ideally it fetches no more than the CPU thread is capable of crunching, but always keeps enough in reserve that the CPU thread never has to wait for an IO fetch to complete.
I'm sure this must be a common problem with theoretical solutions/algorithms, but I don't know its name and couldn't find any information about it.
One strategy is to use a channel with backpressure.
That will work: the IO task fills the buffer up to some statically determined limit and then waits before progressing. But it cannot adjust in real time to the demands of the CPU task.
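To make the static approach concrete, here is a minimal sketch using Python's standard `queue.Queue` as the bounded channel. The names `run_pipeline` and `buffer_limit` are made up for illustration, and the "fetch" and "crunch" steps are stand-ins, not real DB or compute code.

```python
import queue
import threading


def run_pipeline(items, buffer_limit=4):
    """Fetch `items` on one thread and crunch them on another via a bounded queue."""
    buf = queue.Queue(maxsize=buffer_limit)  # put() blocks when full: backpressure
    done = object()  # sentinel marking the end of the input
    results = []

    def io_task():
        for item in items:   # stand-in for fetching a unit from the DB
            buf.put(item)    # waits here while the buffer is full
        buf.put(done)

    def cpu_task():
        while True:
            item = buf.get()  # waits here while the buffer is empty
            if item is done:
                break
            results.append(item * item)  # stand-in for crunching the unit

    threads = [threading.Thread(target=io_task), threading.Thread(target=cpu_task)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The limit is fixed when the queue is created, which is exactly the part that does not adapt to how fast the CPU task is consuming.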
I wonder if there is a known dynamic algorithm (or library) to solve this problem, or if there is an optimal mathematical algorithm for determining the buffer size. I'm sure it has something to do with the average amount of time it takes the CPU task to crunch 1 unit and the average amount of time it takes the IO task to fetch 1 unit, but I'm a little too dumb to see the answer.
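For what it's worth, one rough heuristic along those lines (an assumption on my part, not a known optimal result) is to keep about `fetch time / crunch time` units in reserve, plus a small margin, so the CPU task has work for the whole duration of one fetch. The sketch below tracks both averages with an exponential moving average; `PrefetchEstimator`, `alpha`, and `safety` are hypothetical names.

```python
import math


class PrefetchEstimator:
    """Tracks moving averages of fetch and crunch time (hypothetical helper)."""

    def __init__(self, alpha=0.2, safety=1):
        self.alpha = alpha      # smoothing factor for the moving averages
        self.safety = safety    # extra units kept in reserve to absorb jitter
        self.fetch_avg = None   # average seconds for one fetch to complete
        self.crunch_avg = None  # average seconds to crunch 1 unit

    def _update(self, avg, sample):
        # The first sample seeds the average; later samples are blended in.
        return sample if avg is None else (1 - self.alpha) * avg + self.alpha * sample

    def record_fetch(self, seconds):
        self.fetch_avg = self._update(self.fetch_avg, seconds)

    def record_crunch(self, seconds):
        self.crunch_avg = self._update(self.crunch_avg, seconds)

    def target_depth(self):
        """Units to keep buffered so crunching can continue during one fetch."""
        if not self.fetch_avg or not self.crunch_avg:
            return 1 + self.safety  # no measurements yet: fall back to a small default
        return math.ceil(self.fetch_avg / self.crunch_avg) + self.safety
```

The IO task could compare the current queue length against `target_depth()` before starting another fetch. Note this only covers fetch latency: if the sustained fetch rate is slower than the crunch rate, no buffer size will stop the CPU task from waiting.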
Any advice?
Thanks!