What's everyone working on this week (45/2025)?

New week, new Rust! What are you folks up to?

For a work project, I have been building a DOM-based, SIMD-optimised XML parser called robinson. It is derived from roxmltree and inherits its basic approach of zero-copy parsing, which also means it is limited to fully buffered XML input and needs enough memory to materialise the full DOM.

I mainly optimised the in-memory representation (e.g., replacing array-of-structs (AoS) with struct-of-arrays (SoA) layouts), simplified the parsing and namespace resolution, and applied SIMD intrinsics, going as far as writing bespoke memchr variants needed for e.g. attribute value normalisation.[1]
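
To illustrate the AoS-to-SoA idea in general terms, here is a minimal sketch. The types and field names (`NodeAos`, `NodesSoa`, `kind`, `parent`, `text_start`, `text_end`) are hypothetical and are not taken from robinson or roxmltree.

```rust
// Hypothetical node layouts for illustration only.

/// Array-of-structs: a `Vec<NodeAos>` stores all fields of a node side by side.
#[allow(dead_code)]
struct NodeAos {
    kind: u8,
    parent: u32,
    text_start: u32,
    text_end: u32,
}

/// Struct-of-arrays: one densely packed vector per field, indexed by node id.
#[derive(Default)]
struct NodesSoa {
    kind: Vec<u8>,
    parent: Vec<u32>,
    text_start: Vec<u32>,
    text_end: Vec<u32>,
}

impl NodesSoa {
    /// Appends a node and returns its id (its index in every vector).
    fn push(&mut self, kind: u8, parent: u32, text: std::ops::Range<u32>) -> u32 {
        let id = self.kind.len() as u32;
        self.kind.push(kind);
        self.parent.push(parent);
        self.text_start.push(text.start);
        self.text_end.push(text.end);
        id
    }

    /// A traversal that only needs one field only touches that field's vector.
    fn count_kind(&self, kind: u8) -> usize {
        self.kind.iter().filter(|&&k| k == kind).count()
    }
}
```

With the SoA layout, a pass such as `count_kind` reads one byte per node instead of pulling whole node structs through the cache, and no space is lost to per-node padding.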

After a colleague suggested it, we ran our own benchmarks against other parsers, and the library did pretty well compared to both its origin, roxmltree, and the state-of-the-art quick-xml, even though the latter is a SAX-based parser that neither builds a DOM nor normalises text/CDATA/attribute values. Of course, as written above, quick-xml's use case is more general, and everyone should run their own benchmarks.[2]

Please note that I am not posting to start an XML parser competition. Rather, I do think the SIMD-enabled parsing of qualified XML names might be an interesting technique that other XML parsers might want to adopt.[3]
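
For readers who have not seen the technique, here is a rough sketch of what scanning a qualified name with SSE2 could look like. It is not robinson's code: the function names are made up, and the set of name bytes is a simplifying assumption (ASCII letters, digits, `_`, `-`, `.`, `:`, plus every byte >= 0x80), whereas real XML names restrict the first character and the allowed Unicode ranges.

```rust
/// Simplified name-byte test (an assumption, see above).
const fn is_name_byte(b: u8) -> bool {
    matches!(b, b'a'..=b'z' | b'A'..=b'Z' | b'0'..=b'9' | b'_' | b'-' | b'.' | b':') || b >= 0x80
}

#[cfg(target_arch = "x86_64")]
mod chunked {
    use std::arch::x86_64::*;

    /// Per-byte mask: 0xFF where `lo <= x <= hi` (unsigned), else 0x00.
    fn in_range(x: __m128i, lo: u8, hi: u8) -> __m128i {
        // SAFETY: only SSE2 is needed, which is part of the x86_64 baseline.
        unsafe {
            let clamped =
                _mm_min_epu8(_mm_max_epu8(x, _mm_set1_epi8(lo as i8)), _mm_set1_epi8(hi as i8));
            _mm_cmpeq_epi8(x, clamped)
        }
    }

    /// Scans whole 16-byte chunks. Returns the index of the first non-name
    /// byte found, or the number of bytes covered by full chunks.
    pub fn prefix_len(bytes: &[u8]) -> usize {
        let mut offset = 0;
        for chunk in bytes.chunks_exact(16) {
            // SAFETY: SSE2 is baseline on x86_64 and `chunk` is exactly 16 bytes.
            let name_mask = unsafe {
                let x = _mm_loadu_si128(chunk.as_ptr() as *const __m128i);
                // Setting bit 0x20 maps `A..=Z` onto `a..=z`.
                let letters = in_range(_mm_or_si128(x, _mm_set1_epi8(0x20)), b'a', b'z');
                let digits_colon = in_range(x, b'0', b':'); // `:` directly follows `9`
                let dash_dot = in_range(x, b'-', b'.');
                let underscore = _mm_cmpeq_epi8(x, _mm_set1_epi8(b'_' as i8));
                let ascii = _mm_or_si128(
                    _mm_or_si128(letters, digits_colon),
                    _mm_or_si128(dash_dot, underscore),
                );
                // movemask reads each byte's top bit, so OR-ing in `x`
                // also accepts all bytes >= 0x80.
                _mm_movemask_epi8(_mm_or_si128(ascii, x)) as u32
            };
            let stop = !name_mask & 0xFFFF;
            if stop != 0 {
                return offset + stop.trailing_zeros() as usize;
            }
            offset += 16;
        }
        offset
    }
}

#[cfg(not(target_arch = "x86_64"))]
mod chunked {
    /// Fallback: no chunks are handled, so the scalar loop does all the work.
    pub fn prefix_len(_bytes: &[u8]) -> usize {
        0
    }
}

/// Length of the leading run of name bytes in `bytes`.
pub fn name_len(bytes: &[u8]) -> usize {
    let start = chunked::prefix_len(bytes);
    start + bytes[start..].iter().take_while(|&&b| is_name_byte(b)).count()
}
```

For example, `name_len(b"svg:rect width='1'")` stops at the space after `svg:rect`. The range checks rely on unsigned min/max, which SSE2 provides for bytes, so nothing beyond the x86_64 baseline is required in this sketch.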

EDIT: I also think libraries like these are a nice motivation for portable SIMD, as there is currently a lot of boilerplate and unsafe code required to use SIMD intrinsics. So having a stable, portable and safe API like std::simd would really shift the trade-offs towards writing more bespoke SIMD code IMHO.
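
To give a flavour of that boilerplate, here is a small sketch of a byte counter with an AVX2 path selected at runtime. The function names are hypothetical and the example is unrelated to robinson's internals; it only uses `std::arch` intrinsics and the `is_x86_feature_detected!` macro from the standard library.

```rust
/// Portable fallback, also used for the tail of the AVX2 version.
fn count_byte_scalar(haystack: &[u8], needle: u8) -> usize {
    haystack.iter().filter(|&&b| b == needle).count()
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn count_byte_avx2(haystack: &[u8], needle: u8) -> usize {
    use std::arch::x86_64::*;
    // SAFETY: the caller guarantees AVX2 support, and each load reads
    // exactly one full 32-byte chunk.
    unsafe {
        let needle_v = _mm256_set1_epi8(needle as i8);
        let mut chunks = haystack.chunks_exact(32);
        let mut count = 0;
        for chunk in &mut chunks {
            let v = _mm256_loadu_si256(chunk.as_ptr() as *const __m256i);
            let eq = _mm256_cmpeq_epi8(v, needle_v);
            // One mask bit per matching byte.
            count += (_mm256_movemask_epi8(eq) as u32).count_ones() as usize;
        }
        count + count_byte_scalar(chunks.remainder(), needle)
    }
}

/// Counts occurrences of `needle`, picking the implementation at runtime.
pub fn count_byte(haystack: &[u8], needle: u8) -> usize {
    #[cfg(target_arch = "x86_64")]
    {
        // The detection result is cached by std, but the check itself
        // still runs on every call.
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just checked at runtime.
            return unsafe { count_byte_avx2(haystack, needle) };
        }
    }
    count_byte_scalar(haystack, needle)
}
```

Three functions, two `cfg` attributes and two `unsafe` blocks for a one-line scalar loop. Building for a known CPU and gating on `#[cfg(target_feature = "avx2")]` instead would drop the runtime check, which is the kind of build-time targeting footnote 2 describes.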


  1. I am slowly upstreaming into roxmltree those SIMD optimisations that are based on the existing memchr crate. ↩︎

  2. Note that we do not do CPU feature runtime detection as it does have a small runtime cost and we just build packages targeting our current server hardware at work. ↩︎

  3. For example, quick-xml has an fn name_len which could use it. But they made sure it works as a const fn so I am not sure about the trade-offs there. I certainly only started looking into qualified name parsing when it became the hottest function in a CPU profile. ↩︎
