Hi all, Rust newcomer here. I am trying to find the best interface for defining composable transformations on byte streams. A decoder of a hex-encoded byte stream is a good example because it is simple but has the following features in common with more involved examples:
- The possibility of invalid input (such as characters other than 0-9, a-f, or streams of odd length) necessitates non-trivial error handling.
- It is not a simple byte-for-byte mapping: several input bytes (2 in this case) must be grouped together to produce the next output byte, and optionally whitespace characters should be skipped.
So far I have come up with two approaches: implement the `std::io::Read` trait or the `Iterator` trait. I’ll discuss readers first.
While implementing a hex decoder as a reader adapter, I encountered the following passage in the documentation:
If an error is returned then it must be guaranteed that no bytes were read.
That passage is a bit unclear to me. Does it apply only to transient errors? What if my reader adapter calls its underlying reader multiple times, generates some output, and then encounters an error? Should it discard the error and just return the number of bytes read so far? Or is such behavior prohibited? Note that, in contrast, it is acceptable for a Go reader to return some bytes and an error from the same call.
If a reader adapter conforms to the “no partial failures” rule, it should issue no more than one call to the underlying reader. Then the following problem appears: what if only 1 byte is read? The adapter must not return `Ok(0)`, as that would signify EOF. One solution that I devised was to return an I/O error of kind `std::io::ErrorKind::Interrupted`, but that feels kind of hacky.
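To make the workaround concrete, here is a minimal sketch of the kind of adapter I mean. The names `HexReader` and `nibble`, the 512-byte scratch buffer, and the choice of `InvalidData` / `UnexpectedEof` for malformed input are my own assumptions for illustration, not anything prescribed by the standard library. It calls the underlying reader exactly once per `read` and keeps a dangling high nibble between calls.

```rust
use std::io::{self, Read};

/// Hypothetical adapter: decodes hex text from `inner` into raw bytes.
pub struct HexReader<R> {
    inner: R,
    /// High nibble left over from a previous call, if any.
    pending: Option<u8>,
}

impl<R: Read> HexReader<R> {
    pub fn new(inner: R) -> Self {
        HexReader { inner, pending: None }
    }
}

/// Converts one ASCII hex digit to its value (helper assumed for this sketch).
fn nibble(c: u8) -> io::Result<u8> {
    match c {
        b'0'..=b'9' => Ok(c - b'0'),
        b'a'..=b'f' => Ok(c - b'a' + 10),
        b'A'..=b'F' => Ok(c - b'A' + 10),
        _ => Err(io::Error::new(io::ErrorKind::InvalidData, "invalid hex digit")),
    }
}

impl<R: Read> Read for HexReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if buf.is_empty() {
            return Ok(0);
        }
        let mut scratch = [0u8; 512];
        // Never fetch more digits than `buf` can hold once decoded.
        let want = (buf.len().saturating_mul(2) - self.pending.is_some() as usize)
            .min(scratch.len());
        // Exactly one call to the underlying reader per `read`.
        let n = self.inner.read(&mut scratch[..want])?;
        if n == 0 {
            return match self.pending {
                // EOF in the middle of a pair: the input had odd length.
                Some(_) => Err(io::Error::new(
                    io::ErrorKind::UnexpectedEof,
                    "odd number of hex digits",
                )),
                None => Ok(0),
            };
        }
        let mut written = 0;
        for &c in &scratch[..n] {
            if c.is_ascii_whitespace() {
                continue;
            }
            // An invalid digit fails the whole call (treated as fatal here).
            let v = nibble(c)?;
            match self.pending.take() {
                Some(hi) => {
                    buf[written] = (hi << 4) | v;
                    written += 1;
                }
                None => self.pending = Some(v),
            }
        }
        if written == 0 {
            // Input was consumed but no full byte is ready; Ok(0) would mean EOF.
            return Err(io::Error::new(io::ErrorKind::Interrupted, "need more input"));
        }
        Ok(written)
    }
}
```

Helpers such as `read_to_end` and `read_exact` retry on `Interrupted`, so callers that go through them never see the workaround, but anyone calling `read` directly has to handle it.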
Implementing a hex decoder as an iterator brings its own set of questions. First, to retain meaningful error information I must set `Item=Result<u8, E>` for some `E`. But then composing iterators becomes rather cumbersome. Also, there is no easy way for a caller to distinguish recoverable errors, which one can try to iterate past, from unrecoverable ones. One solution that I can think of is to set a flag after an unrecoverable error and return `None` on subsequent calls to `.next()`. Is there a better approach?
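Here is a rough sketch of the flag idea, to show what I mean. `HexDecoder`, `HexError`, and the `failed` field are hypothetical names, and I assume for simplicity that the input iterator yields plain `u8` values (no I/O errors) and that every decoding error is unrecoverable.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum HexError {
    InvalidDigit(u8),
    OddLength,
}

/// Hypothetical iterator adapter: hex digits in, decoded bytes out.
pub struct HexDecoder<I> {
    inner: I,
    /// Set after the first error; from then on `next` returns `None`.
    failed: bool,
}

impl<I: Iterator<Item = u8>> HexDecoder<I> {
    pub fn new(inner: I) -> Self {
        HexDecoder { inner, failed: false }
    }

    /// Returns the next hex digit's value, skipping ASCII whitespace.
    fn next_nibble(&mut self) -> Option<Result<u8, HexError>> {
        loop {
            let c = self.inner.next()?;
            if c.is_ascii_whitespace() {
                continue;
            }
            return Some(match c {
                b'0'..=b'9' => Ok(c - b'0'),
                b'a'..=b'f' => Ok(c - b'a' + 10),
                b'A'..=b'F' => Ok(c - b'A' + 10),
                _ => Err(HexError::InvalidDigit(c)),
            });
        }
    }

    /// Records the failure and hands the error back to the caller once.
    fn fail(&mut self, e: HexError) -> Option<Result<u8, HexError>> {
        self.failed = true;
        Some(Err(e))
    }
}

impl<I: Iterator<Item = u8>> Iterator for HexDecoder<I> {
    type Item = Result<u8, HexError>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.failed {
            return None;
        }
        // A clean end of input before the high nibble ends the stream.
        let hi = match self.next_nibble()? {
            Ok(v) => v,
            Err(e) => return self.fail(e),
        };
        // End of input after the high nibble means odd length.
        let lo = match self.next_nibble() {
            Some(Ok(v)) => v,
            Some(Err(e)) => return self.fail(e),
            None => return self.fail(HexError::OddLength),
        };
        Some(Ok((hi << 4) | lo))
    }
}
```

With this shape, `collect::<Result<Vec<u8>, HexError>>()` stops at the first error, which covers the simplest callers, but it does nothing to tell recoverable errors apart from fatal ones.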
My final question concerns interoperability between these two approaches. A reader can be transformed into a byte stream iterator with a call to `.bytes()`. Why isn’t the reverse transformation (i.e. implementing `std::io::Read` for byte stream iterators) defined in the standard library? It seems straightforward and fairly useful, but I can’t do it myself since neither the trait nor the type is defined by me.
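For reference, the conversion I have in mind looks roughly like the sketch below. Since I can’t implement `Read` directly for foreign iterator types, it goes through a local wrapper; `IterReader` and its `deferred` field are hypothetical names of mine. The `deferred` slot is there so that an error arriving after some bytes were already copied is reported on the next call rather than alongside a partial read.

```rust
use std::io::{self, Read};

/// Hypothetical wrapper that exposes an iterator of `io::Result<u8>` as a reader.
pub struct IterReader<I> {
    iter: I,
    /// Error seen after some bytes were already copied; returned on the next call.
    deferred: Option<io::Error>,
}

impl<I> IterReader<I> {
    pub fn new(iter: I) -> Self {
        IterReader { iter, deferred: None }
    }
}

impl<I: Iterator<Item = io::Result<u8>>> Read for IterReader<I> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Report an error held back by the previous call before reading more.
        if let Some(e) = self.deferred.take() {
            return Err(e);
        }
        let mut written = 0;
        while written < buf.len() {
            match self.iter.next() {
                Some(Ok(b)) => {
                    buf[written] = b;
                    written += 1;
                }
                // Nothing copied yet: the error can be returned directly.
                Some(Err(e)) if written == 0 => return Err(e),
                // Bytes already copied: hand them out first, keep the error.
                Some(Err(e)) => {
                    self.deferred = Some(e);
                    break;
                }
                None => break,
            }
        }
        Ok(written)
    }
}
```

Having to wrap every iterator like this is exactly the friction that makes me wonder why the standard library does not provide the conversion.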