A few years ago I started work on a library for parsing DICOM, a binary format in which medical images are commonly encoded or transmitted. I started the project to learn Rust as well as better familiarize myself with the internals of a format I regularly work with.
I've been developing this project in a Mercurial repository hosted on an instance of Phabricator, but recently I've been making decent progress and wanted to open the project on GitHub so it's more accessible for review or even contribution.
I have not yet worked on any ability to extract images, though that is one of the next big steps I'll be looking at. I've primarily been focused on supporting the variety of binary formats and value parsing, which I believe to be pretty thorough at this point. Some of the main features are:
- Transfer syntax detection - little vs. big endian and implicit vs. explicit VR, detecting for both file-meta and for the dataset in the absence of file-meta.
- Sequence and item parsing including frames within PixelData and undefined length elements.
- In-memory hierarchical representation of datasets and use of a TagPath for querying through sequences and items.
- Example command-line tools for printing the contents of DICOM files, browsing the contents of a DICOM file in a TUI, scanning a directory of DICOM files to find files which fail to parse, and indexing a directory of DICOM files into a database.
- Parsing of the DICOM standard's XML definitions into Rust code definitions and lookup.
- Parsing DICOM without requiring the standard DICOM definitions. Despite the dictionary being available, only a very small subset of definitions is necessary for parsing a binary stream. The DICOM standard dictionary is quite large, or at least my representation of it is -- The release-compiled
- Very few dependencies for parsing. Currently the parsing library only uses encoding for character set encoding/decoding and thiserror for custom errors.
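To give a feel for what the transfer syntax detection bullet involves when there is no file-meta, here is a minimal sketch of one possible heuristic. It is not the library's actual code: `Endian`, `VrMode`, and `guess_transfer_syntax` are hypothetical names made up for illustration. It assumes the buffer starts at the first element of the dataset, treats bytes 4..6 being a known two-letter VR as a sign of explicit VR, and guesses byte order by assuming the first group number is small (as it usually is, e.g. 0x0002 or 0x0008).

```rust
/// Hypothetical types for illustration only, not the library's API.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Endian {
    Little,
    Big,
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum VrMode {
    Explicit,
    Implicit,
}

/// Two-letter value representation codes defined by the DICOM standard.
const KNOWN_VRS: &[&[u8; 2]] = &[
    b"AE", b"AS", b"AT", b"CS", b"DA", b"DS", b"DT", b"FL", b"FD", b"IS", b"LO", b"LT",
    b"OB", b"OD", b"OF", b"OL", b"OV", b"OW", b"PN", b"SH", b"SL", b"SQ", b"SS", b"ST",
    b"SV", b"TM", b"UC", b"UI", b"UL", b"UN", b"UR", b"US", b"UT", b"UV",
];

/// Guess byte order and VR mode from the first 8 bytes of a dataset.
/// Returns None if there are not enough bytes to look at.
fn guess_transfer_syntax(head: &[u8]) -> Option<(Endian, VrMode)> {
    if head.len() < 8 {
        return None;
    }

    // Explicit VR stores a two-character VR code right after the 4-byte tag.
    // In implicit VR those bytes are part of the 4-byte length instead.
    let vr_mode = if KNOWN_VRS.iter().any(|vr| vr[..] == head[4..6]) {
        VrMode::Explicit
    } else {
        VrMode::Implicit
    };

    // Assumption: the first group number is small, so whichever byte order
    // produces the smaller group value is taken as the likely one.
    let group_le = u16::from_le_bytes([head[0], head[1]]);
    let group_be = u16::from_be_bytes([head[0], head[1]]);
    let endian = if group_le <= group_be {
        Endian::Little
    } else {
        Endian::Big
    };

    Some((endian, vr_mode))
}
```

A heuristic like this can be fooled, for example by an implicit VR length whose bytes happen to spell a valid VR, so a real parser would want to cross-check against more than the first element.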
The documentation and API are still something I'm working on cleaning up but I'm open to any feedback, particularly anywhere I can improve idiomatic Rust code, API, or crate layout. I make use of cargo fix --edition-idioms, cargo clippy, and cargo fmt but I still occasionally find better ways of utilizing Rust's expressiveness and standard library. Just today I figured out using Iterator::chain(once()) for inserting an item into an iterator (was previously collecting items and pushing an item), and also found a use for
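For anyone unfamiliar with the Iterator::chain(once()) pattern, here is a small sketch contrasting it with the collect-and-push approach. The function names and the use of u32 tag values are made up for illustration and are not taken from the library.

```rust
use std::iter::once;

/// Yields every tag in `tags` followed by `extra`, lazily and without
/// allocating an intermediate Vec.
fn with_extra<'a>(tags: &'a [u32], extra: u32) -> impl Iterator<Item = u32> + 'a {
    tags.iter().copied().chain(once(extra))
}

/// The collect-then-push equivalent: same items, but it allocates a Vec
/// up front even if the caller only wants to iterate once.
fn with_extra_collected(tags: &[u32], extra: u32) -> Vec<u32> {
    let mut all: Vec<u32> = tags.iter().copied().collect();
    all.push(extra);
    all
}
```

Both produce the same sequence; the chained version just defers the work until the iterator is consumed, which is handy when the result is immediately fed into another adapter or a for loop.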
I come from a background of Java, but Rust's concepts of ownership and borrowing have clicked with me since I first read about them years ago, so I've been excited figuring out how to put them into practice. One of the big factors that held up making progress in the past few years has been the IDE experience. It turns out after years of Java development I've built a sizable crutch on the IDE to assist me with development - primarily the ability to forgo looking up any documentation outside of the IDE, and instead having documentation in tooltips or jump-to-code in a click. In the past ~6-9 months I think the experience has improved greatly but there are still some rough edges. I've gotten into the habit of explicitly typing everything, which has helped clue in the IDE -- is there a way to turn on a warning when a type can be specified? I primarily use IntelliJ IDEA community edition and lots of eprintln!() debugging, but I've been wanting to switch to vim + rust-analyzer (mainly from watching lots of Jon Gjengset's streams/videos!).