Design Guidance: Custom Binary File Reader

pdunc · September 30, 2015, 9:20pm

Hi there,

I've spent the past week or so playing with Rust and reading the book. I really like the ideals behind the language, and I want to dig deeper. To do so I think I need a real project, and a binary file parser seems to be at a good technical level.

I have previously lived in OOP environments, with dynamic dispatch and inheritance. My struggle is around design considerations when using Rust: I can't get my head around what to do when we don't know which order things are going to happen in.

This isn't really a question, I'm just stuck and looking to stimulate discussion!

Project 'Spec'

The target is STDF (structured/standard test data format). There are more details available here, but it is essentially a record-based file structure. Each record consists of several fields, which form an ordered list drawn from a set of types. The beginning of an example record might be:

| FIELD   | TYPE | NOTE                      |
|---------|------|---------------------------|
| REC_TYP | U16  | Type of record            |
| REC_SUB | U16  | Sub-type of record        |
| REC_LEN | U32  | Number of bytes in record |
| LOT_ID  | Cn   | ...                       |

So each record starts with an indication of the type (REC_TYP + REC_SUB) and the record size (REC_LEN) - it then proceeds with the binary data laid out as per the specification.
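To make the header layout concrete, here is a minimal sketch of reading those three fields. The field widths follow the example table above. Little-endian byte order and the names `RecordHeader` and `read_from` are my own assumptions, not something the format description here confirms.

```rust
use std::io::{self, Read};

/// Header fields as laid out in the example table (U16, U16, U32).
#[derive(Debug, PartialEq)]
pub struct RecordHeader {
    pub rec_typ: u16,
    pub rec_sub: u16,
    pub rec_len: u32,
}

impl RecordHeader {
    /// Reads the 8 header bytes from any `Read` source.
    /// Assumption: all integers are little-endian.
    pub fn read_from<R: Read>(reader: &mut R) -> io::Result<Self> {
        let mut buf = [0u8; 8];
        reader.read_exact(&mut buf)?;
        Ok(RecordHeader {
            rec_typ: u16::from_le_bytes([buf[0], buf[1]]),
            rec_sub: u16::from_le_bytes([buf[2], buf[3]]),
            rec_len: u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]),
        })
    }
}
```

Because `read_from` is generic over `Read`, the same code works on a `File`, a `BufReader`, or an in-memory `Cursor` in tests.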

From a parsing perspective, the records can be in any order and we do not know ahead of time how many records of each type there may be.
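Since the order and count are unknown, one option is a first pass that only splits the input into raw records and leaves decoding for later. The sketch below assumes the header layout from the example table, little-endian integers, and that REC_LEN counts the bytes following the header. `RawRecord` and `split_records` are hypothetical names.

```rust
/// One undecoded record: its type tags plus the payload bytes.
#[derive(Debug, PartialEq)]
pub struct RawRecord {
    pub rec_typ: u16,
    pub rec_sub: u16,
    pub payload: Vec<u8>,
}

/// Walks the buffer header by header until it is exhausted.
/// Assumptions: 8-byte little-endian header, REC_LEN excludes the header.
pub fn split_records(mut data: &[u8]) -> Result<Vec<RawRecord>, String> {
    let mut records = Vec::new();
    while !data.is_empty() {
        if data.len() < 8 {
            return Err("truncated header".to_string());
        }
        let rec_typ = u16::from_le_bytes([data[0], data[1]]);
        let rec_sub = u16::from_le_bytes([data[2], data[3]]);
        let rec_len = u32::from_le_bytes([data[4], data[5], data[6], data[7]]) as usize;
        let rest = &data[8..];
        if rest.len() < rec_len {
            return Err("truncated payload".to_string());
        }
        let (payload, tail) = rest.split_at(rec_len);
        records.push(RawRecord { rec_typ, rec_sub, payload: payload.to_vec() });
        // Continue with whatever follows this record.
        data = tail;
    }
    Ok(records)
}
```

Nothing here needs to know which record types exist; the length field alone is enough to skip from one record to the next.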

Considerations (from me)

- I could define structs for each type (Cn, U32, ...) which implement a Read/Write trait
- I could define the Record struct with a Vec of 'types'
  - To read a record I could cycle through the Vec and do the read/write
- I could alternatively have a struct for each REC_TYP, REC_SUB combination (these are called MIR, PIR, PRR, etc.)
  - Each struct would implement a read/write trait which would do the read/write operation on each element of the struct
  - This seems like a lot of repetition
More generally, I'm trying to think of a top-level API for a library that would be useful to someone who wants to implement an application to convert to something human-readable (CSV, ...), pull certain aspects into a database, or create a wrapper for something like pandas in Python.
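As a rough sketch of the struct-per-record option, the per-type reading logic can live in one small field reader, so each record struct only lists its fields in order. Everything named here (`FieldReader`, `ParseRecord`, `ExampleRecord` and its `count` field) is hypothetical. I am also assuming little-endian integers and that Cn is a single length byte followed by that many character bytes.

```rust
/// Cursor over a record payload; each read_* call consumes one field.
pub struct FieldReader<'a> {
    data: &'a [u8],
}

impl<'a> FieldReader<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        FieldReader { data }
    }

    /// Takes the next `n` bytes, or None if the payload is too short.
    fn take(&mut self, n: usize) -> Option<&'a [u8]> {
        let data: &'a [u8] = self.data;
        if data.len() < n {
            return None;
        }
        let (head, tail) = data.split_at(n);
        self.data = tail;
        Some(head)
    }

    pub fn read_u8(&mut self) -> Option<u8> {
        self.take(1).map(|b| b[0])
    }

    /// Assumption: little-endian.
    pub fn read_u32(&mut self) -> Option<u32> {
        self.take(4).map(|b| u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
    }

    /// Assumption: Cn is a length byte followed by that many bytes of text.
    pub fn read_cn(&mut self) -> Option<String> {
        let len = self.read_u8()? as usize;
        self.take(len).map(|b| String::from_utf8_lossy(b).into_owned())
    }
}

/// Implemented once per record type (MIR, PIR, PRR, ...).
pub trait ParseRecord: Sized {
    fn parse(fields: &mut FieldReader<'_>) -> Option<Self>;
}

/// Hypothetical record with one U32 field and one Cn field.
#[derive(Debug, PartialEq)]
pub struct ExampleRecord {
    pub count: u32,
    pub lot_id: String,
}

impl ParseRecord for ExampleRecord {
    fn parse(fields: &mut FieldReader<'_>) -> Option<Self> {
        // Fields are read in the order the spec lists them.
        Some(ExampleRecord {
            count: fields.read_u32()?,
            lot_id: fields.read_cn()?,
        })
    }
}
```

The repetition shrinks to one line per field. If that is still too much, a `macro_rules!` macro that generates the struct and its `parse` body from a field list is one possible next step.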

Struggles

What do I store the output as without knowledge of record types ahead of time?
- A common use case might be:
  - Parse the file
  - Get all the serial numbers from PRR records
  - Get all the data from PIR records
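One possible answer, sketched under assumptions: store every parsed record in a single enum, keep them all in one `Vec`, and let the caller filter by variant. The field sets of `Pir` and `Prr` below are simplified and hypothetical (I am treating `part_id` as the serial number), and the `Unknown` variant is my own addition for record types the library does not decode.

```rust
/// Simplified, hypothetical PIR fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Pir {
    pub head_num: u8,
    pub site_num: u8,
}

/// Simplified, hypothetical PRR fields; `part_id` stands in for the serial number.
#[derive(Debug, Clone, PartialEq)]
pub struct Prr {
    pub part_id: String,
}

/// One variant per record type, so a Vec<Record> can hold them in file order.
#[derive(Debug, Clone, PartialEq)]
pub enum Record {
    Pir(Pir),
    Prr(Prr),
    /// Record types that are not decoded are kept as raw bytes.
    Unknown { rec_typ: u16, rec_sub: u16, payload: Vec<u8> },
}

/// Collects the serial numbers from all PRR records, in file order.
pub fn serial_numbers(records: &[Record]) -> Vec<&str> {
    records
        .iter()
        .filter_map(|r| match r {
            Record::Prr(prr) => Some(prr.part_id.as_str()),
            _ => None,
        })
        .collect()
}

/// Collects references to all PIR records, in file order.
pub fn pir_records(records: &[Record]) -> Vec<&Pir> {
    records
        .iter()
        .filter_map(|r| match r {
            Record::Pir(pir) => Some(pir),
            _ => None,
        })
        .collect()
}
```

The enum plays the role that a base class with subclasses would in an OOP design: the set of record types is closed and known from the spec, so `match` can replace dynamic dispatch.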
My head hurts
