Please convince yourself that an implementation of Read2 for std::fs::File cannot have fewer copies than what the Read trait does. The type signatures alone already tell me that, and indeed, they directly imply that any implementation of Read2 for std::fs::File that uses standard read calls must necessarily maintain an internal buffer. This is what std::io::BufReader does for any implementation of Read, but Read does not require the use of an internal buffer and is thus more flexible.
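To make the distinction concrete, here is a minimal sketch using only the standard library. It does not use Read2 at all (I'm not reproducing its signature here). It just contrasts the two shapes: with Read the caller owns the buffer, while with BufRead (which BufReader implements) the reader owns the buffer and lends out slices. The function names are made up for illustration.

```rust
use std::io::{self, BufRead, Read};

/// Counts newline bytes using only `Read`: the caller owns and supplies the buffer.
fn count_lines_read<R: Read>(mut rdr: R, buf: &mut [u8]) -> io::Result<usize> {
    let mut count = 0;
    loop {
        // One copy: from the reader into the caller's buffer.
        let n = rdr.read(buf)?;
        if n == 0 {
            return Ok(count);
        }
        count += buf[..n].iter().filter(|&&b| b == b'\n').count();
    }
}

/// Counts newline bytes through `BufRead`: the reader owns the buffer and lends slices.
fn count_lines_bufread<R: BufRead>(mut rdr: R) -> io::Result<usize> {
    let mut count = 0;
    loop {
        let n = {
            // Still one copy: from the underlying source into the internal buffer.
            let chunk = rdr.fill_buf()?;
            if chunk.is_empty() {
                return Ok(count);
            }
            count += chunk.iter().filter(|&&b| b == b'\n').count();
            chunk.len()
        };
        rdr.consume(n);
    }
}
```

Both functions do one copy per read call. The difference is only in who owns the destination buffer, which is why wrapping any Read in a std::io::BufReader gets you the second shape, but not the other way around.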
You fundamentally misunderstand the purpose of the buffer rolling. It has to do with line-oriented searching, context handling and limiting the use of heap memory. In ripgrep's implementation, there are exactly as many copies as would be done if it used your Read2 implementation. Moreover, ripgrep's implementation permits the amortization of allocation, which is critical, and it's not obvious to me how that would be done with your Read2 trait.
More generally, your Read2 trait assumes the use of an internal buffer. ripgrep's searching requires not only the use of an internal buffer, but one that can be extended dynamically based on the size of the largest line. It's not obvious to me that an implementation of Read2 would support such a use case.
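For illustration, here is a rough sketch of what "rolling" a line buffer on top of plain Read can look like. This is not ripgrep's actual code. The LineBuffer type and its methods are hypothetical and simplified (no context handling, no binary detection). It shows the partial trailing line being moved to the front, the buffer doubling only when a single line does not fit, and the same allocation being reused across fills.

```rust
use std::io::{self, Read};

/// Hypothetical line buffer: exposes only whole lines, keeps a trailing partial line.
struct LineBuffer {
    buf: Vec<u8>,
    /// `buf[..end]` holds valid data.
    end: usize,
    /// `buf[..last_nl]` holds complete lines, including the final '\n'.
    last_nl: usize,
}

impl LineBuffer {
    fn new(capacity: usize) -> LineBuffer {
        LineBuffer { buf: vec![0; capacity.max(1)], end: 0, last_nl: 0 }
    }

    /// The complete lines currently available for searching.
    fn lines(&self) -> &[u8] {
        &self.buf[..self.last_nl]
    }

    /// Drops the lines already handed out, rolls the partial line to the
    /// front, and reads more. Returns false once nothing is left.
    fn fill<R: Read>(&mut self, rdr: &mut R) -> io::Result<bool> {
        // Roll: move the trailing partial line to the start of the buffer.
        self.buf.copy_within(self.last_nl..self.end, 0);
        self.end -= self.last_nl;
        self.last_nl = 0;
        loop {
            // Grow only when one line has filled the entire buffer.
            if self.end == self.buf.len() {
                let new_len = self.buf.len() * 2;
                self.buf.resize(new_len, 0);
            }
            let n = rdr.read(&mut self.buf[self.end..])?;
            if n == 0 {
                // EOF: treat any remaining bytes as a final unterminated line.
                self.last_nl = self.end;
                return Ok(self.end > 0);
            }
            let old_end = self.end;
            self.end += n;
            // The rolled partial line has no newline, so scan only the new bytes.
            let fresh = &self.buf[old_end..self.end];
            if let Some(i) = fresh.iter().rposition(|&b| b == b'\n') {
                self.last_nl = old_end + i + 1;
                return Ok(true);
            }
        }
    }
}
```

The caller loops on fill and searches lines() each time. The allocation is amortized because the Vec is only ever resized upward and is reused for every subsequent fill.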
Finally, that there are two implementations of ripgrep's search is a failing of mine, not of the Read trait. The implementations have been unified in my dev branch as part of factoring more of ripgrep's internals out into libraries, and I didn't need Read2 to do it. Moreover, in my dev branch, the library supports an important new feature: the ability to limit or control the amount of heap allocation being done. If I used Read2, then I don't see how that could be implemented using the interface you've provided. It can be done with the Read trait, however, because the Read trait makes far fewer assumptions than Read2.
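As a sketch of why owning the buffer matters here: when the caller allocates the buffer it passes to read, the caller also gets to decide the growth policy. The function below is hypothetical (the name, the doubling strategy and the error are my assumptions for illustration, not what the dev branch does). It just shows a cap being enforced at the one place where the buffer would otherwise grow.

```rust
use std::io;

/// Hypothetical growth policy: doubles `buf`, but never beyond `limit` bytes.
fn grow_within_limit(buf: &mut Vec<u8>, limit: usize) -> io::Result<()> {
    if buf.len() >= limit {
        // The caller decides what happens next: skip the file, report it, etc.
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("line exceeds the configured heap limit of {} bytes", limit),
        ));
    }
    // Double the length (starting from at least 1), clamped to the limit.
    let new_len = buf.len().max(1).saturating_mul(2).min(limit);
    buf.resize(new_len, 0);
    Ok(())
}
```

If the buffer lives inside the reader instead, a policy like this has to be baked into the reader's interface, which is the assumption I'm objecting to.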
This is once again false. I explained why above. This is why I suggested that we stop communicating, because it isn't productive. The regex engine itself requires UTF-8, so regardless of what an implementation of Read2 yields, whether it's u16 or otherwise, some explicit transcoding step is required.
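To show what "explicit transcoding step" means in the simplest case, here is a sketch using only the standard library. It is not how ripgrep does its transcoding. The function name is made up, and it assumes the u16 code units have already been assembled from the input bytes.

```rust
use std::char::{decode_utf16, REPLACEMENT_CHARACTER};

/// Transcodes UTF-16 code units to a UTF-8 `String`.
/// Unpaired surrogates are replaced with U+FFFD rather than causing an error.
fn utf16_to_utf8(units: &[u16]) -> String {
    decode_utf16(units.iter().copied())
        .map(|r| r.unwrap_or(REPLACEMENT_CHARACTER))
        .collect()
}
```

Whatever the reader hands back, something like this has to run before the bytes reach a regex engine that only matches on UTF-8.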