Apache-avro: How to decode?

The documentation of the apache-avro crate says this (sorry about the formatting, I can't access the markup):

Reading data
As far as reading Avro encoded data goes, we can just use the schema encoded with the data to read them. The library will do it automatically for us, as it already does for the compression codec:

use apache_avro::Reader;
// reader creation can fail in case the input to read from is not Avro-compatible or malformed
let reader = Reader::new(&input[..]).unwrap();


In case, instead, we want to specify a different (but compatible) reader schema from the schema the data has been written with, we can just do as the following:

use apache_avro::Schema;
use apache_avro::Reader;

let reader_raw_schema = r#"
    {
        "type": "record",
        "name": "test",
        "fields": [
            {"name": "a", "type": "long", "default": 42},
            {"name": "b", "type": "string"},
            {"name": "c", "type": "long", "default": 43}
        ]
    }
"#;

let reader_schema = Schema::parse_str(reader_raw_schema).unwrap();

// reader creation can fail in case the input to read from is not Avro-compatible or malformed
let reader = Reader::with_schema(&reader_schema, &input[..]).unwrap();
The library will also automatically perform schema resolution while reading the data.

For more information about schema compatibility and resolution, please refer to the Avro Specification.

As usual, there are two ways to handle Avro data in Rust, as you can see below.

NOTE: The library also provides a low-level interface for decoding a single datum in Avro bytecode without markers and header (for advanced use), but we highly recommend the Reader interface to leverage all Avro features. Please read the API reference in case you are interested.

The avro way
We can just read directly instances of Value out of the Reader iterator:

use apache_avro::Reader;
let reader = Reader::new(&input[..]).unwrap();

// value is a Result of an Avro Value in case the read operation fails
for value in reader {
    println!("{:?}", value.unwrap());
}

The serde way
Alternatively, we can use a Rust type implementing Deserialize and representing our schema to read the data into:

use apache_avro::Reader;
use apache_avro::from_value;

#[derive(Debug, Deserialize)]
struct Test {
    a: i64,
    b: String,
}

let reader = Reader::new(&input[..]).unwrap();

// value is a Result in case the read operation fails
for value in reader {
    println!("{:?}", from_value::<Test>(&value.unwrap()));
}

But when I try to use said code, it gives me this weird type: apache_avro::Reader<'_, &[u8]>, which does not have an unwrap method.

I have tried using the from_value method, but that gives me the Value type, which is useless.

The crate's documentation seems to be the only place to find info about such things, but there is nothing about this.

This is the current code:

use lazy_static::lazy_static;
use apache_avro::*;
use apache_avro::types::Record;
use serde::Deserialize;
use serde::Serialize;

#[derive(Clone, Copy, Serialize, Deserialize, PartialEq, Eq, Debug, AvroSchema)]
pub struct X {
    x: bool,
}

lazy_static! {
	static ref SCHEMA: Schema = X::get_schema();
}



pub fn decode_avro(bytes: &[u8]) -> X {
	Reader::with_schema(&SCHEMA, bytes).unwrap() 
}

How can I actually decode it?

Can you share the code you've written, and specifically what parts you don't understand?

Why do you say Value is useless? It's an enum containing one of the kinds of data Avro supports.

You may also want to check that you're viewing documentation that matches the version of the crate you're using.
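To illustrate why an enum like Value is still usable on the reading side, here is a minimal sketch of pulling a field out of a record by pattern matching. `MiniValue` and `bool_field` are hypothetical stand-ins written for this example; `MiniValue` only mirrors the general shape of a few variants and is not the crate's own type, which has many more.

```rust
/// Stand-in for a small subset of a dynamically typed Avro value.
/// This is NOT `apache_avro::types::Value`; it only mirrors its general shape.
#[derive(Debug, Clone, PartialEq)]
enum MiniValue {
    Boolean(bool),
    Long(i64),
    String(String),
    /// A record is an ordered list of (field name, field value) pairs.
    Record(Vec<(String, MiniValue)>),
}

/// Looks up a boolean field by name in a record value.
/// Returns `None` if the value is not a record, the field is missing,
/// or the field is not a boolean.
fn bool_field(value: &MiniValue, name: &str) -> Option<bool> {
    match value {
        MiniValue::Record(fields) => fields.iter().find_map(|(key, field)| match field {
            MiniValue::Boolean(b) if key.as_str() == name => Some(*b),
            _ => None,
        }),
        _ => None,
    }
}
```

Calling `bool_field(&record, "x")` on a record that holds a boolean field `x` returns `Some` with its contents. Matching by hand like this is the manual counterpart of what a serde-based conversion such as from_value does for a whole struct.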

This is the current code. I was also reading the 0.14.0 documentation:

use lazy_static::lazy_static;
use apache_avro::*;
use apache_avro::types::Record;
use serde::Deserialize;
use serde::Serialize;

#[derive(Clone, Copy, Serialize, Deserialize, PartialEq, Eq, Debug, AvroSchema)]
pub struct X {
    x: bool,
}

lazy_static! {
	static ref SCHEMA: Schema = X::get_schema();
}



pub fn decode_avro(bytes: &[u8]) -> X {
	Reader::with_schema(&SCHEMA, bytes).unwrap() 
}

Value seems to be a one-way encoding into the Avro format, so I can't really use it.

The Reader is an Iterator, which is why the docs examples can use it in a loop to get values. If you're only trying to decode a single value, you can call the Iterator::next() trait method directly to get the first value (assuming there is one):

use apache_avro::*;
use lazy_static::lazy_static;
use serde::Deserialize;
use serde::Serialize;

#[derive(Clone, Copy, Serialize, Deserialize, PartialEq, Eq, Debug, AvroSchema)]
pub struct X {
    x: bool,
}

lazy_static! {
    static ref SCHEMA: Schema = X::get_schema();
}

pub fn decode_avro(bytes: &[u8]) -> X {
    let value = Reader::with_schema(&SCHEMA, bytes)
        .expect("Failed to create reader from schema")
        // Get the first value from the Reader via the Iterator trait method
        .next()
        // Unwrap the Option next returns since we're expecting at least one value
        .expect("No values in input")
        // Unwrap the Result that the Reader provides since we expect the input to be a valid value
        .expect("Failed to parse value");

    from_value(&value)
        // Unwrap the result from deserializing with serde
        .expect("serde Deserialize failed")
}
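The chain of expect calls above peels off two layers: the Option returned by next() and the Result inside it. If panicking is not acceptable, the same two layers can be turned into a single Result. The following is a sketch using only the standard library; `DecodeError`, `first_value`, and `parse_flag` are hypothetical names made up for this example, and the iterator of parsed strings merely stands in for a reader that can fail on each value.

```rust
/// Hypothetical error type for this sketch; a real program would use the
/// decoding library's own error type instead.
#[derive(Debug, PartialEq)]
enum DecodeError {
    /// The input contained no values at all.
    Empty,
    /// A value was present but could not be parsed.
    Malformed(String),
}

/// Takes the first item of any iterator that yields `Result`s, which is the
/// same shape as a reader whose every read can fail.
fn first_value<I, T>(mut items: I) -> Result<T, DecodeError>
where
    I: Iterator<Item = Result<T, DecodeError>>,
{
    items
        .next() // Option<Result<T, DecodeError>>
        .ok_or(DecodeError::Empty)? // None becomes Err(Empty); the inner Result is returned as-is
}

/// Stand-in for decoding one value: parses "true" or "false".
fn parse_flag(raw: &str) -> Result<bool, DecodeError> {
    raw.parse::<bool>()
        .map_err(|_| DecodeError::Malformed(raw.to_string()))
}
```

For example, `first_value(vec!["true"].into_iter().map(parse_flag))` yields a successful boolean, while an empty iterator yields `Err(DecodeError::Empty)`. Whether to propagate errors like this or to panic with expect depends on whether malformed input is something the caller can reasonably recover from.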

That makes a lot more sense. I was used to parsers that didn't have such details 😀
