I'm a relatively new Rust programmer (< 6 months); I've mostly programmed in GC languages before...
To avoid XY issues, here's my problem. I was refactoring a function at $WORK. The point of the function is to process a CSV, using the Reader object in the csv crate. I've posted a simplified version of what the code looked like here:
The key point is that the function takes in a Reader<Read + Seek>. It does that so that it has access to the seek() method on the reader, which is only defined if the inner reader can seek().
So far, so good. But my refactoring task is to try and make it so that the function can also process GZipped CSVs. OK, no problem, there's a great crate, flate2, for handling GZipped data. Oh, but the handle flate2 returns is Read, but not Seek. Reading up on this makes it seem unlikely that you could make a Seek implementation for gzip data, which TBH makes sense to me intuitively. So, bummer.
But wait--the only reason my function does a seek() is so that it can read in the number of records, then go back to the start of the file and read in the records sequentially (if anyone is interested, the reason it is doing this is so that it can preallocate some vectors). So I don't really need the ability to seek()--I just need to be able to get the number of records in the file, and then go through the records.
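To make the count-then-rewind pattern concrete, here is a minimal sketch using only std rather than the csv crate. It is not the real code: read_with_rewind is a made-up name, and plain lines stand in for CSV records.

```rust
use std::io::{self, BufRead, BufReader, Read, Seek, SeekFrom};

/// Hypothetical stand-in for the original function: count the lines,
/// rewind to the start, then collect them into a preallocated Vec.
/// Lines play the role of CSV records here.
fn read_with_rewind<R: Read + Seek>(mut input: R) -> io::Result<Vec<String>> {
    // First pass: only count the records, propagating any I/O error.
    let mut count = 0usize;
    for line in BufReader::new(&mut input).lines() {
        line?;
        count += 1;
    }

    // This is the step that requires Seek, and the one a gzip stream cannot do.
    input.seek(SeekFrom::Start(0))?;

    // Second pass: the count from the first pass lets us allocate once.
    let mut rows = Vec::with_capacity(count);
    for line in BufReader::new(&mut input).lines() {
        rows.push(line?);
    }
    Ok(rows)
}
```

Calling it with something like std::io::Cursor::new("a,1\nb,2\n") works because Cursor is both Read and Seek; a decoder that is only Read would be rejected by the R: Read + Seek bound.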
Ok, I say, this is easy--just read the file twice. Once to count the records, then once to process them. How do I do that? Well, my thought is, change the function to accept something that can generate the reader more than once. I'll generate one reader to count the rows, then a second one for the actual processing.
My instinct, then, is to have my function take in a closure that returns a reader. One problem: what is the type of this thing? I tried stuff like
read_csv_twice(
g: &dyn Fn() -> Reader<Read>
)
but I just got errors about unknown sizes. Which makes sense to me; I've read the Rust book chapters about trait objects and that, and I understand why the compiler won't allow that. I just can't tell if there is a way to actually do what I want, or if this kind of "generates a thing of this type" is not a thing that Rust allows.
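For comparison, here is a sketch of how the "closure that produces a reader" idea can be typed, again using only std so that it stands alone. The names read_twice and read_twice_dyn are made up for illustration. The assumption is that the same shape would carry over to the csv crate by having the closure return a csv::Reader<R>, or a csv::Reader<Box<dyn Read>>, built with csv::Reader::from_reader. As far as I can tell, the unknown-size errors come from Read being a bare trait used where a sized type is needed; a generic parameter or a Box<dyn Read> gives the compiler a sized type to work with.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Generic version: the caller supplies a closure that opens a fresh
/// reader on every call. R is a concrete type chosen by the caller,
/// so nothing here is unsized.
fn read_twice<R, F>(open: F) -> io::Result<Vec<String>>
where
    R: Read,
    F: Fn() -> io::Result<R>,
{
    // First reader: count the records (lines stand in for CSV records).
    let mut count = 0usize;
    for line in BufReader::new(open()?).lines() {
        line?;
        count += 1;
    }

    // Second reader: start again from the beginning, no seek() needed.
    let mut rows = Vec::with_capacity(count);
    for line in BufReader::new(open()?).lines() {
        rows.push(line?);
    }
    Ok(rows)
}

/// Trait-object version: boxing the reader gives it a known size,
/// so the closure itself can be passed as &dyn Fn.
fn read_twice_dyn(open: &dyn Fn() -> io::Result<Box<dyn Read>>) -> io::Result<Vec<String>> {
    // &dyn Fn() -> T implements Fn() -> T, and Box<dyn Read> implements Read.
    read_twice(open)
}
```

With the generic version, a plain file and a gzip decoder can each be passed as their own closure without an enum, at the cost of the function being monomorphized per reader type. The boxed version is the one to reach for when the choice between reader types is only known at runtime.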
Anyway, I solved the issue by making enums which handle all the types of things I might want to read. My solution looks like this:
But I'm wondering if this is the best way to solve this.