Noob Iterator Consumption Question

I would appreciate some help, as I seem to be running up against that dreaded learning curve.

I have a directory with some files in it, and I want to process all of the files in that directory, the results of which are to be stored in a vec. The reading part is simple enough:

    let my_files = match fs::read_dir(read_dir) {
        Ok(filedir) => filedir,
        Err(e) => panic!("got error attempting to read file dir: {:?}", e),
    };

Now that I have my ReadDir, I want to allocate a vec with as much capacity as I have files in my directory. That itself is also a relatively simple task:

    let object_list = Vec::with_capacity(my_files.count());

Now I want to iterate over every file in my_files, but I'm being told that my_files is being used after move. Surely there must be some way to re-use iterators, but I can't seem to find the answer I'm looking for.

Calling count "does" the iteration and consumes the iterator. The iterator returned by read_dir can only be used once, to do a single iteration over all the items. Fortunately Vecs can grow dynamically, so you often don't need to pre-allocate with the correct size and can just .push new entries on the fly.

Also, I think there really is no way at all to know the size of the iterator beforehand anyway, because iterating through it corresponds to actually reading the contents of the directory live, so you might even see the effects of changes to the directory that happened after the iteration had already started.

If you want to use more "fancy" iterator adapters, you could try using map and collect. Or for an easier start, try a for loop (for file in my_files) and use object_list.push as mentioned.
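To make those two options concrete, here is a rough sketch of both. The function names and the choice to store each entry's path as a PathBuf are just assumptions for illustration; your real code would do whatever per-file processing you need instead of calling .path().

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Variant 1: a plain for loop that pushes into a growable Vec.
// (Hypothetical helper name; the "processing" here is just taking the path.)
fn collect_paths_loop(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut object_list = Vec::new();
    for entry in fs::read_dir(dir)? {
        // Each item is an io::Result<DirEntry>, so handle the error per entry.
        let entry = entry?;
        object_list.push(entry.path());
    }
    Ok(object_list)
}

// Variant 2: the same thing with map and collect.
// Collecting into io::Result<Vec<_>> stops at the first error.
fn collect_paths_iter(dir: &Path) -> io::Result<Vec<PathBuf>> {
    fs::read_dir(dir)?
        .map(|entry| entry.map(|e| e.path()))
        .collect()
}
```

Both return the same set of paths, though the order is whatever the OS hands back, so sort the Vec if you need something stable.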


More generally, most iterators don't implement Copy even if they trivially could, because it's felt that it would be too easy to accidentally copy them so that you repeatedly look at the first element of a copied iterator, or similar. So instead, when sensible, they implement Clone. So in many situations, if needed, you can use a clone of an iterator and then use the original.
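As a small illustration of the clone approach, here is a sketch using a slice iterator, which does implement Clone. The function name is made up for the example.

```rust
// Count the items through a clone, then sum using the original iterator.
// (Hypothetical helper; std::slice::Iter implements Clone but not Copy.)
fn count_then_sum(values: &[i32]) -> (usize, i32) {
    let iter = values.iter();
    // count() takes the iterator by value, so only the clone is consumed.
    let count = iter.clone().count();
    // The original has not been advanced and still yields every element.
    let sum: i32 = iter.sum();
    (count, sum)
}
```

Calling count_then_sum(&[1, 2, 3]) gives (3, 6). Without the .clone(), the call to sum would fail to compile with the same "use after move" error.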

In this situation though, ReadDir (the iterator) doesn't implement Clone either. (It holds some OS-specific directory handle or such.) So if you wanted to do the iteration twice, you'd have to call std::fs::read_dir again.
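For completeness, a sketch of what calling read_dir twice would look like. The function name is hypothetical, and note that this walks the directory twice and the count can be stale if the directory changes between the two calls, so the capacity is only a hint.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Pre-allocate by reading the directory twice (hypothetical helper name).
fn collect_preallocated(dir: &Path) -> io::Result<Vec<PathBuf>> {
    // First ReadDir: fully consumed by count().
    let file_count = fs::read_dir(dir)?.count();
    let mut object_list = Vec::with_capacity(file_count);

    // Second, fresh ReadDir for the actual processing.
    for entry in fs::read_dir(dir)? {
        object_list.push(entry?.path());
    }
    Ok(object_list)
}
```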

However, I agree with @steffahn that doing the iteration twice is not the best approach here. Even if the file count doesn't change between iterations, you're probably going to lose more time with all those syscalls walking the filesystem twice than you save on reallocation.

My code was already working before I decided I wanted to pre-allocate the vec. It seems like that was a bit of needless pre-optimization in this case.

Thank you both for your assistance.
