I have a directory which may contain hundreds of thousands of files.
The filenames are prefixed with a priority number
Each file has a modification date.
for example:
100_randomid.bin modification date 10 seconds ago <--- the file we want
100_randomid.bin modification date 6 seconds ago
100_randomid.bin modification date 7 seconds ago
100_randomid.bin modification date 1 seconds ago
099_randomid.bin modification date 3 seconds ago
099_randomid.bin modification date 2 seconds ago
099_randomid.bin modification date 1 seconds ago
099_randomid.bin modification date 0 seconds ago
098_randomid.bin modification date 10 seconds ago
098_randomid.bin modification date 8 seconds ago
098_randomid.bin modification date 2 seconds ago
098_randomid.bin modification date 1 seconds ago
I have multiple async tasks running in multiple threads "the consumers".
Each consumer task needs to remove the file at the top of the sorted list as shown in the example.
The list of files must be sorted first on name (larger priority comes first), and then on modification date. The example above shows the sort order.
Files are being added thousands of time per second - with random priority and random modification date.
The naive solution would be for each consumer task to sort the directory into the order shown above and get the top file, but this might be happening tens of thousands of times per second so would be way too slow.
Then I thought perhaps I could have a separate task with an inotifywait watching the directory. Each time it is notified of a new file being added to or removed from the directory, it updates an index that it maintains in memory. Consumers could then ask this function for the filename at the top of the sorted list.
The question that I have is - what sort of code allows creation of such an index that would be suitable for maintaining a sort list whilst constantly adding new filenames that have random priorities and random modification dates? Is there a specialised index of some sort suited to this? Or would I have something like a Vector of Structs and simply re-sort the Vec based on the values in the struct every time a new file is added to or removed from the directory (still seems inefficient)? Is there a better way to do it?
thanks!