Hi there,
I've spent the past week or so playing with Rust and reading the book. I really like the ideals behind the language, and as such I want to dig deeper. To do so I think I need a real project, and one that seems to be at a good technical level is a binary file parser.
I have previously lived in OOP environments, with dynamic dispatch and inheritance. My struggle is around design considerations when using Rust: I can't get my head around what to do when we don't know which order things are going to happen in.
This isn't really a question, I'm just stuck and looking to stimulate discussion!
Project 'Spec'
The target is STDF (Standard Test Data Format). There are more details available here, but it is essentially a record-based file structure. Each record consists of several fields, which form an ordered list drawn from a fixed set of types. The beginning of an example record might be:
| FIELD | TYPE | NOTE |
|---------|------|------------------------|
| REC_TYP | U16 | Type of record |
| REC_SUB | U16 | Sub-type of record |
| REC_LEN | U32 | Number bytes in record |
| LOT_ID | Cn | ... |
So each record starts with an indication of the type (`REC_TYP` + `REC_SUB`) and the record size (`REC_LEN`). It then proceeds with the binary data laid out as per the specification.
From a parsing perspective, the records can be in any order and we do not know ahead of time how many records of each type there may be.
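Because every record carries its own type and length, a reader never has to know the order in advance: it can read the header, read `REC_LEN` bytes, and decide afterwards what to do with them. Below is a minimal sketch of that loop step, using the field widths from the example table. The names `RecordHeader`, `read_header` and `read_raw_record` are made up for illustration. Little-endian byte order, and `REC_LEN` counting only the bytes after the header, are both assumptions that would need checking against the specification.

```rust
use std::io::{self, Read};

/// Fixed-size header that precedes every record, using the field widths
/// from the example table (an illustrative type, not part of any library).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RecordHeader {
    pub rec_typ: u16,
    pub rec_sub: u16,
    pub rec_len: u32,
}

/// Reads one header from any byte source.
/// Assumption: fields are little-endian; a real parser may need to make
/// the byte order configurable.
pub fn read_header<R: Read>(reader: &mut R) -> io::Result<RecordHeader> {
    let mut buf = [0u8; 8];
    reader.read_exact(&mut buf)?;
    Ok(RecordHeader {
        rec_typ: u16::from_le_bytes([buf[0], buf[1]]),
        rec_sub: u16::from_le_bytes([buf[2], buf[3]]),
        rec_len: u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]),
    })
}

/// Reads a header followed by its payload, without interpreting the payload.
/// Assumption: `rec_len` counts only the bytes that follow the header.
pub fn read_raw_record<R: Read>(reader: &mut R) -> io::Result<(RecordHeader, Vec<u8>)> {
    let header = read_header(reader)?;
    let mut payload = vec![0u8; header.rec_len as usize];
    reader.read_exact(&mut payload)?;
    Ok((header, payload))
}
```

Calling `read_raw_record` repeatedly until the reader is exhausted yields the records in file order, whatever that order happens to be. Unknown record types can be skipped because their length is known.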
Considerations (from me)
- I could define structs for each type (`Cn`, `U32`, ...) which implement a `Read`/`Write` trait
    - I could define the Record struct with a Vec of 'types'
    - To read a record I could cycle through the Vec and do the read/write
- I could alternatively have a struct for each REC_TYP, REC_SUB combination (these are called MIR, PIR, PRR, etc.)
    - Each struct would implement a read/write trait which would do the read/write operation on each element of the struct
    - This seems like a lot of repetition
- Generally I am trying to think of a top-level API for a library, which would be useful if someone wants to implement an application to: convert to something human readable (CSV, ...), pull certain aspects into a database, or create a wrapper to something like `pandas` in `Python`.
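The first two considerations can be combined: one trait implemented by the field types, and the same trait implemented by each record struct in terms of its fields. The sketch below shows the decode side only. The trait name `Decode`, the `Cn` layout (a one-byte length followed by that many bytes of text) and the fields of `Prr` are all assumptions made for illustration, not taken from the specification.

```rust
/// Hypothetical field-level trait: decode a value from the front of a byte
/// slice and return it together with the bytes that remain.
pub trait Decode: Sized {
    fn decode(input: &[u8]) -> Option<(Self, &[u8])>;
}

/// `U32` maps directly onto Rust's `u32` (little-endian is an assumption).
impl Decode for u32 {
    fn decode(input: &[u8]) -> Option<(Self, &[u8])> {
        if input.len() < 4 {
            return None;
        }
        let (head, rest) = input.split_at(4);
        Some((u32::from_le_bytes([head[0], head[1], head[2], head[3]]), rest))
    }
}

/// `Cn`, assumed here to be a one-byte length followed by that many bytes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Cn(pub String);

impl Decode for Cn {
    fn decode(input: &[u8]) -> Option<(Self, &[u8])> {
        let (&len, rest) = input.split_first()?;
        let len = len as usize;
        if rest.len() < len {
            return None;
        }
        let (text, rest) = rest.split_at(len);
        Some((Cn(String::from_utf8_lossy(text).into_owned()), rest))
    }
}

/// Illustrative record struct; the two fields are invented for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Prr {
    pub part_count: u32,
    pub serial: Cn,
}

/// A record decodes by decoding its fields in declaration order.
impl Decode for Prr {
    fn decode(input: &[u8]) -> Option<(Self, &[u8])> {
        let (part_count, rest) = u32::decode(input)?;
        let (serial, rest) = Cn::decode(rest)?;
        Some((Prr { part_count, serial }, rest))
    }
}
```

The per-record `impl` is the repetitive part. Since each one is just "decode every field in order", it is the kind of boilerplate that a `macro_rules!` macro or a derive macro could generate, though that is a design choice rather than a requirement.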
Struggles
- What do I store the output as without knowledge of record types ahead of time?
    - A common use case might be:
        - Parse the file
        - Get all the serial numbers from PRR records
        - Get all the data from PIR records
- My head hurts
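One possible answer to the storage question, sketched under assumptions: where an OOP design would hold a list of base-class pointers, Rust can hold a `Vec` of an enum with one variant per record type. The order and count of records then do not matter, and each use case becomes a filter over that `Vec`. The fields of `Pir` and `Prr` below are invented, and `Record`, `prr_serials` and `pir_records` are hypothetical names.

```rust
/// Illustrative record structs; the fields are made up for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Pir {
    pub head: u32,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Prr {
    pub serial: String,
}

/// One variant per known record type, plus a catch-all so that record
/// types the library does not model yet are preserved as raw bytes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Record {
    Pir(Pir),
    Prr(Prr),
    Unknown {
        rec_typ: u16,
        rec_sub: u16,
        payload: Vec<u8>,
    },
}

/// "Get all the serial numbers from PRR records."
pub fn prr_serials(records: &[Record]) -> Vec<&str> {
    records
        .iter()
        .filter_map(|record| match record {
            Record::Prr(prr) => Some(prr.serial.as_str()),
            _ => None,
        })
        .collect()
}

/// "Get all the data from PIR records."
pub fn pir_records(records: &[Record]) -> Vec<&Pir> {
    records
        .iter()
        .filter_map(|record| match record {
            Record::Pir(pir) => Some(pir),
            _ => None,
        })
        .collect()
}
```

A parsed file is then just a `Vec<Record>` in file order. A library could also expose the records as an iterator instead of a `Vec`, so that a CSV converter or database loader can process large files without holding everything in memory.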