It’s an interesting benchmark, but the two aren’t measuring the same thing. walkdir does depth-first iteration, while your program does breadth-first iteration. The latter is generally going to use more memory, since it requires memory proportional to the fattest directory, whereas the former only requires memory proportional to the depth of the deepest directory. How exactly that impacts performance isn’t clear.
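To make the memory difference concrete, here is a minimal std-only sketch of both traversal orders. It is not how walkdir or your program is implemented; the function names `walk_depth_first` and `walk_breadth_first` are made up for illustration, and each one returns the peak size of its pending-work structure alongside the visited paths.

```rust
use std::collections::VecDeque;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Depth-first walk (hypothetical helper). Keeps one open directory
/// iterator per level, so the stack grows with depth, not with width.
/// Returns the visited paths and the peak stack length.
pub fn walk_depth_first(root: &Path) -> io::Result<(Vec<PathBuf>, usize)> {
    let mut visited = vec![root.to_path_buf()];
    let mut stack = vec![fs::read_dir(root)?];
    let mut peak = stack.len();
    loop {
        // Pull the next entry from the directory on top of the stack.
        let next = match stack.last_mut() {
            Some(top) => top.next(),
            None => break,
        };
        match next {
            // This directory is exhausted; go back up one level.
            None => {
                stack.pop();
            }
            Some(entry) => {
                let entry = entry?;
                let path = entry.path();
                // file_type() does not follow symlinks, so symlinked
                // directories are listed but never descended into.
                if entry.file_type()?.is_dir() {
                    stack.push(fs::read_dir(&path)?);
                    peak = peak.max(stack.len());
                }
                visited.push(path);
            }
        }
    }
    Ok((visited, peak))
}

/// Breadth-first walk (hypothetical helper). Queues every subdirectory
/// it finds before descending, so the queue grows with directory width.
/// Returns the visited paths and the peak queue length.
pub fn walk_breadth_first(root: &Path) -> io::Result<(Vec<PathBuf>, usize)> {
    let mut visited = vec![root.to_path_buf()];
    let mut queue = VecDeque::from([root.to_path_buf()]);
    let mut peak = queue.len();
    while let Some(dir) = queue.pop_front() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let path = entry.path();
            if entry.file_type()?.is_dir() {
                queue.push_back(path.clone());
            }
            visited.push(path);
        }
        peak = peak.max(queue.len());
    }
    Ok((visited, peak))
}
```

Calling both on the same tree yields the same set of paths in a different order. On a directory with many subdirectories and little nesting, the breadth-first peak is the larger of the two.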
In general, walkdir should be compared against things like ftw. Those operate in roughly the same space, with a similar set of configurable features.
Also, walkdir does error handling, and in particular, its iterator yields an error for each element that could not be read. When an error occurs, the file path is copied for convenient error reporting. Generally, this isn’t an issue in practice because you typically want to show users those error messages, so it isn’t wasted work. Your benchmark doesn’t quite account for this though. And in particular, a benchmark over / is likely to encounter lots of errors. (Look at the stderr output of find / to get an idea.)
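As a rough illustration of what "yielding errors with a copied path" costs, here is a std-only sketch. It is not walkdir's actual error type or implementation; `WalkError` and `walk_collect` are hypothetical names, and the sketch collects results into a `Vec` rather than implementing an iterator.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical error type: the I/O error plus an owned copy of the
/// path that failed, so the caller can print a useful message.
#[derive(Debug)]
pub struct WalkError {
    pub path: PathBuf,
    pub source: io::Error,
}

/// Recursively walks `root`, recording every entry that could be read
/// as `Ok(path)` and every failure as `Err(WalkError)`. Failures do not
/// stop the walk.
pub fn walk_collect(root: &Path, out: &mut Vec<Result<PathBuf, WalkError>>) {
    let entries = match fs::read_dir(root) {
        Ok(entries) => entries,
        Err(source) => {
            // The path is copied only on the error path.
            out.push(Err(WalkError { path: root.to_path_buf(), source }));
            return;
        }
    };
    for entry in entries {
        match entry {
            Ok(entry) => {
                let path = entry.path();
                // Treat an unreadable file type as "not a directory".
                let is_dir = entry.file_type().map(|t| t.is_dir()).unwrap_or(false);
                out.push(Ok(path.clone()));
                if is_dir {
                    walk_collect(&path, out);
                }
            }
            Err(source) => {
                // The entry itself could not be read; report its parent.
                out.push(Err(WalkError { path: root.to_path_buf(), source }));
            }
        }
    }
}
```

A benchmark that silently drops the `Err` cases does less work than one that reports them, which is part of why the two numbers are hard to compare on a tree like / where many directories are unreadable.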
Unless you understand the trade-offs walkdir is making for you in directory traversal, I generally would not recommend rolling your own. At some point, your users are going to want things like “follow symlinks,” “stay on the same file system,” “don’t exceed some maximum depth” or “sort my entries.” All of those things are provided by walkdir out of the box.