Hi,
I am trying to use apache-avro v0.18.0, but I don't understand how to use it. To write data, it provides a Writer<'a, W: Write> whose lifetime is bound to the schema.
My problem is, I want to repeatedly write to a single file. The options I have:
Keep the writer: How do I store it in a struct together with the schema? That would lead to a self-referential struct. I know there is a crate that helps with that, but it's kind of annoying.
Create a writer on every write: I tried that, but either it overwrites the whole file or, if the file is opened in append mode, it corrupts the file, probably because the header gets written multiple times.
There is also pub fn append_to(schema: &'a Schema, writer: W, marker: [u8; 16]) -> Self, which does not write the header, but it takes a marker argument, which seems to be initialized with random values.
I think the marker argument you need to pass is the sync marker that is stored in the file header. Looks like you could use apache_avro::read_marker to get it from a byte slice containing the previously written data. Maybe you could do something like first create a Writer into a Vec, immediately call .into_inner() on it, then read the marker out of it. Then write the bytes to your file and save the marker and File. Then on the next write you could use Writer::append_to with the Schema, File and marker. See also this test: avro-rs/avro/tests/append_to_existing.rs at bab254d546d6f17d65ae4d3259dbe79efaa6f456 · apache/avro-rs · GitHub
Disclaimer: I've never used Apache Avro. I'm just going off the docs and this test I found.
Thanks, that helped. It seems that the marker is just the last 16 bytes written. I read those from the file and it seems to work.
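For anyone finding this later, here is a rough sketch of how reading those trailing bytes could look using only the standard library. The helper name read_trailing_marker is made up for illustration and is not part of apache-avro. It assumes the file was fully flushed and ends on a block boundary; in the Avro object container format the header and every data block end with the 16-byte sync marker, which is why the tail of the file should match it.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Length of the sync marker in an Avro object container file.
const SYNC_MARKER_LEN: usize = 16;

/// Hypothetical helper (not an apache-avro API): returns the last 16 bytes
/// of the file at `path`. Assumes the file was completely flushed, so that
/// it ends with a sync marker rather than a partially written block.
fn read_trailing_marker(path: &Path) -> io::Result<[u8; SYNC_MARKER_LEN]> {
    let mut file = File::open(path)?;

    // Refuse files that are too short to contain a marker at all.
    let len = file.metadata()?.len();
    if len < SYNC_MARKER_LEN as u64 {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "file is shorter than a sync marker",
        ));
    }

    // Jump to 16 bytes before the end and read exactly that many bytes.
    file.seek(SeekFrom::End(-(SYNC_MARKER_LEN as i64)))?;
    let mut marker = [0u8; SYNC_MARKER_LEN];
    file.read_exact(&mut marker)?;
    Ok(marker)
}
```

The returned array can then be passed as the marker argument to Writer::append_to, together with the schema and the file reopened in append mode.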
I still do not understand the crate's design choice there. Why is Schema not simply Clone, with Writer owning it?