I am trying to deserialize big XML files. I am using quick-xml because it seems the most popular option.
But I can't find a way to easily stream and deserialize at the same time.
The data structure I am trying to deserialize is quite simple; it is a Vec<Thingy>. So the file looks like:
<bla>
<thingy attr1=... attr2=... ... />
<thingy attr1=... attr2=... ... />
<thingy attr1=... attr2=... ... />
...
</bla>
If I use quick_xml::de::from_reader to deserialize, I run out of memory. The final purpose is to fill up an SQL database, so I thought I could just stream the XML into thingies and, for each Thingy, insert it into the database.
But this turns out to be more complicated than I thought. In order to stream something, it seems that I must use an API like read_event_into, which seems way lower-level than I am willing to go. It would require way more work (and trouble, and maintenance, and bugs, etc.) than I would like.
I was hoping that I could just get the string "<thingy attr1=... attr2=... ... />" from the event and then just call quick_xml::de::from_str on it. That seems a little ugly, but I can live with it. Yet, if I understand the docs correctly, that does not seem to be an option.
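To make the shape of what I am after concrete, here is a rough sketch that does the splitting with the standard library only, bypassing the XML parser for that step. It rests on assumptions that may not hold for real files: every <thingy ... /> is self-closing and sits on its own line, as in the sample above. The name for_each_thingy_chunk is made up for illustration, and the closure is a stand-in for the per-element step (for example a call to quick_xml::de::from_str followed by the database insert).

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: calls `handle` once per `<thingy ... />` line and
/// returns how many chunks were seen.
/// Assumption: each element is self-closing and occupies exactly one line.
fn for_each_thingy_chunk<R, F>(reader: R, mut handle: F) -> io::Result<usize>
where
    R: BufRead,
    F: FnMut(&str),
{
    let mut count = 0;
    // Only one line is held in memory at a time.
    for line in reader.lines() {
        let line = line?;
        let chunk = line.trim();
        if chunk.starts_with("<thingy") && chunk.ends_with("/>") {
            handle(chunk);
            count += 1;
        }
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    // Made-up sample data; a real run would wrap a File in a BufReader.
    let xml = "<bla>\n<thingy attr1=\"a\" attr2=\"1\" />\n<thingy attr1=\"b\" attr2=\"2\" />\n</bla>\n";
    let mut chunks = Vec::new();
    let n = for_each_thingy_chunk(xml.as_bytes(), |c| chunks.push(c.to_string()))?;
    println!("{n} chunks, first: {}", chunks[0]);
    Ok(())
}
```

This is not a real XML parser: it breaks as soon as an element spans several lines or has children, which is exactly why I would prefer a proper crate to do it.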
Is there a way with quick_xml or another reasonably mature crate to stream Thingies easily here? Basically a SAX-style API, not at the node level but at the "object you are trying to deserialize" level.