JSON base64 String deserialization to Array3

Hi,

I am a complete beginner in Rust (coming from Python), and while I can see how clean and robust Rust requires code to be, I still have trouble with some of the basic concepts.

I want to "transfer" an array from Python (a 3D NumPy array with shape (14, 484, 636)) to an Array3 in Rust to do some manipulation on it. It was suggested to me to create a JSON file in Python containing the data encoded in base64, then reopen it and reconstruct my array in Rust.

I have been able to serialize my data (an image array of shape (25, 25) for a start) in a JSON file in python (I put the code down here for clarity/suggestions):

import json
import base64

base64_study = base64.b64encode(arr)
base64_string = base64_study.decode("utf-8")

json_dict = {
    "height": arr.shape[1],
    "width": arr.shape[2],
    "nb_frames": arr.shape[0],
    "image_data": base64_string,
}

with open("dummy_data.json", "w") as outfile:
    json.dump(json_dict, outfile)

But then when I try to deserialize my data in Rust:

use std::fs::File;

use serde::{Deserialize, Serialize};

#[derive(Debug, Deserialize, Serialize)]
struct Study {
    height: u16,
    width: u16,
    image_data: String,
    nb_frames: u16
}

fn main() {
    let file = File::open("dummy_data.json").unwrap();
    let study: Study = serde_json::from_reader(&file).expect("JSON was not well-formatted");
    // Decode the base64 string back into raw bytes
    let img_data: Vec<u8> = base64::decode(study.image_data).unwrap();
    println!("{:?}", img_data.len());
}

>> cargo run
>> 5000

But I get an immense array with way more 0s than expected (length 5000 instead of 625)! I am not sure what's happening.

Then the next question is how to reshape the flattened array back to its original shape (25 x 25).

Thank you very much for any help you can provide !

5000 = 625 * 8. You are deserializing to raw bytes, but apparently your image data is stored in a different format by the Python code, 8 bytes per value. It's probably either 64-bit integers or 64-bit floating-point numbers – check the exact type in Python before serializing.
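
To make the factor of 8 concrete, here is a minimal sketch using only the Rust standard library. It assumes the array was float64 (NumPy's default floating-point type) and that the bytes are little-endian, which is typical on x86 and ARM machines; both are assumptions to verify against `arr.dtype` in Python. `bytes_to_f64` is a hypothetical helper name, not something from this thread.

```rust
/// Reinterpret a raw byte buffer as little-endian f64 values.
/// Returns None if the length is not a multiple of 8.
/// Assumption: the producer wrote little-endian float64 data.
fn bytes_to_f64(bytes: &[u8]) -> Option<Vec<f64>> {
    if bytes.len() % 8 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(8)
            .map(|chunk| {
                // Copy each 8-byte chunk into a fixed-size array for from_le_bytes.
                let mut buf = [0u8; 8];
                buf.copy_from_slice(chunk);
                f64::from_le_bytes(buf)
            })
            .collect(),
    )
}
```

With this helper, a 5000-byte buffer yields 625 values, which matches the 25 x 25 image. Converting to uint8 on the Python side avoids the reinterpretation entirely, at the cost of truncating values outside 0..=255.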

You were right! I looked into it, and the solution was to make sure that the array I was encoding in base64 was uint8:

arr = arr.astype(np.uint8)

Regarding rebuilding the array in Rust, I wrote this (in case someone stumbles upon this thread) for a 2D array:

use ndarray::{Array2};

fn base64_to_array2_u8(data: Vec<u8>, height: usize, width: usize) -> Array2<u8> {
    let mut arr = Array2::<u8>::zeros((height, width));
    let mut idx: usize = 0;
    for i in 0..height {
        for j in 0..width {
            arr[[i, j]] = data[idx];
            idx += 1;
        }
    }
    arr
}

for a 3D array:

use ndarray::{Array3};

pub fn base64_to_array3_u8(data: Vec<u8>, frame: usize, height: usize, width: usize) -> Array3<u8> {
    let mut arr = Array3::<u8>::zeros((frame, height, width));
    let mut idx: usize = 0;
    for f in 0..frame {
        for i in 0..height {
            for j in 0..width {
                arr[[f, i, j]] = data[idx];
                idx += 1;
            }
        }
    }
    arr
}

If anyone has a suggestion to make this code better, I am happy to learn and improve! I wonder if it's possible to have one function for multiple types?

Regards,

Like so?

Wow! Fantastic! This is wonderful, thank you so much. It gave me great insight into how to use traits and impl, neither of which I am familiar with at all! Using map instead of nested loops is also a much more advanced approach! I have much to learn from this example!

Thank you so much for all this valuable information !

You should really just use from_shape_vec() instead of looping manually.
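
For anyone wondering why no loop is needed: a row-major (C-order) buffer, which is NumPy's default layout, already stores element (f, i, j) at offset (f * height + i) * width + j. Reshaping therefore only needs a length check plus the shape. The sketch below illustrates that idea with the standard library only. `Grid3` and `grid3_from_shape_vec` are hypothetical stand-ins, not ndarray's implementation, and the `String` error takes the place of ndarray's `ShapeError`.

```rust
/// Hypothetical 3D view over a flat, row-major buffer.
#[derive(Debug, PartialEq)]
struct Grid3 {
    shape: (usize, usize, usize),
    data: Vec<u8>,
}

/// Take ownership of the buffer without copying it; fail if the length
/// does not match frames * height * width.
fn grid3_from_shape_vec(shape: (usize, usize, usize), data: Vec<u8>) -> Result<Grid3, String> {
    let (frames, height, width) = shape;
    let expected = frames * height * width;
    if data.len() != expected {
        return Err(format!("expected {} elements, got {}", expected, data.len()));
    }
    Ok(Grid3 { shape, data })
}

impl Grid3 {
    /// Element at (f, i, j), computed from the row-major offset.
    fn get(&self, f: usize, i: usize, j: usize) -> u8 {
        let (_, height, width) = self.shape;
        self.data[(f * height + i) * width + j]
    }
}
```

The nested loops in the earlier functions walk `idx` through exactly these offsets, which is why they can be replaced by a single call that checks the length and keeps the existing buffer.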

Thanks, I suspected such a thing existed and looked for it, but missed it.

Updated playground.

Me too ! I didn't see it, I'll upgrade my code using it ! Thank you very much for that !

With this trait I cannot call .shape() on the array created from base64. Am I missing something, given that the output should be an Array2 or Array3 and should therefore have the .shape() method from the ndarray crate?

Such as:

let nb_frames: usize = 14;
let height: usize = 484;
let width: usize = 636;
let img_arr = img64.base64_to_array([nb_frames, height, width]);
println!("IMG ARR SHAPE: {:?}", img_arr.shape());

Is there something that I am not understanding correctly ?

Here is the code I used, credit to quinedot and H2CO3:

use ndarray::{Array2, Array3, ShapeError};

pub trait BaseToArray<const N: usize>: Sized {
    type Array;
    fn base64_to_array(self, dimensions: [usize; N]) -> Result<Self::Array, ShapeError>;
}

impl<T> BaseToArray<2> for Vec<T> {
    type Array = Array2<T>;
    fn base64_to_array(self, dimensions: [usize; 2]) -> Result<Self::Array, ShapeError> {
        let [height, width] = dimensions;
        Self::Array::from_shape_vec((height, width), self)
    }
}

impl<T> BaseToArray<3> for Vec<T> {
    type Array = Array3<T>;
    fn base64_to_array(self, dimensions: [usize; 3]) -> Result<Self::Array, ShapeError> {
        let [frame, height, width] = dimensions;
        Self::Array::from_shape_vec((frame, height, width), self)
    }
}

You should be getting an error like this:

error[E0599]: no method named `shape` found for enum `Result` in the current scope
  --> src/main.rs:28:16
   |
28 |     let _ = a3.shape();
   |                ^^^^^ method not found in `Result<ArrayBase<OwnedRepr<{integer}>, Dim<[usize; 3]>>, ShapeError>`

The reason is in the error - you're not handling the case when conversion fails (i.e. when the vector size doesn't correspond to the dimensions).

I am not sure I understand, because when I follow the error's suggestion and run rustc --explain E0599, the explanation says I should implement the shape() method rather than handle an error:

This error occurs when a method is used on a type which doesn't implement it:

Erroneous code example:

struct Mouth;

let x = Mouth;
x.chocolate(); // error: no method named `chocolate` found for type `Mouth`
               //        in the current scope

In this case, you need to implement the `chocolate` method to fix the error:

struct Mouth;

impl Mouth {
    fn chocolate(&self) { // We implement the `chocolate` method here.
        println!("Hmmm! I love chocolate!");
    }
}

let x = Mouth;
x.chocolate(); // ok!

I am not sure what I should do.

"Error handling" means getting the wrapped value out of the result if it's an Ok, or doing something with the error if it's an Err. The shape method doesn't exist on Result, only on its wrapped type (Array).
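
As a small standard-library illustration of that point: in the sketch below, `len` lives on `Vec`, not on `Result`, so the value has to be taken out first. `reshape_check` is a hypothetical stand-in for `base64_to_array`, and the `String` error stands in for `ShapeError`.

```rust
/// Hypothetical stand-in for a fallible conversion: succeeds only if
/// the buffer has exactly `expected` elements.
fn reshape_check(data: Vec<u8>, expected: usize) -> Result<Vec<u8>, String> {
    if data.len() == expected {
        Ok(data)
    } else {
        Err(format!("expected {} elements, got {}", expected, data.len()))
    }
}

/// Handle both cases explicitly and return the element count on success.
fn count_elements(data: Vec<u8>, expected: usize) -> Option<usize> {
    let result = reshape_check(data, expected);
    // `result.len()` would not compile here: `Result` has no `len` method.
    match result {
        // The wrapped Vec does have `len`.
        Ok(values) => Some(values.len()),
        Err(message) => {
            eprintln!("conversion failed: {}", message);
            None
        }
    }
}
```

The same pattern applies to `.shape()`: match on the `Result` first, then call the method on the array inside `Ok`.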

This interpretation would be correct, if the type in question was what you expect. But it's not - you say you're expecting Array3, but Rust can't find the method on Result.

Thank you very much for taking the time to answer but I really do not understand what it means.

Following H2CO3's comments, I tried this (following this from the documentation):

pub trait BaseToArray<const N: usize>: Sized {
    type Array;
    fn base64_to_array(self, dimensions: [usize; N]) -> Result<Self::Array, ShapeError>;
}

impl<T> BaseToArray<2> for Vec<T> {
    type Array = Array2<T>;
    fn base64_to_array(self, dimensions: [usize; 2]) -> Result<Self::Array, ShapeError> {
        let [height, width] = dimensions;
        let arr = Self::Array::from_shape_vec((height, width), self);
        let arr = match arr {
             Ok(arr) => arr,
             Err(ShapeError) => panic!("Cannot handle data type: {:?}", ShapeError)
        };
    }
}

But it's not working either, there is something that I do not grasp:

  1. The enum Result (if I understand correctly, it is Result<Self::Array, ShapeError> in this case) asks for a Self::Array if there is no issue during compiling and raises a ShapeError otherwise.
  2. arr is a Self::Array with the appropriate shape
  3. I use match to define how error handling is made to check on arr

but it doesn't work, I don't understand what's the problem here ?

Rust is a statically typed language, and every type must match. Therefore, if base64_to_array is specified to return Result, you have to:

  • really return Result and not Array (as you were already doing),
  • treat it as a Result at the call-site (as you were not doing - you tried to treat it as an Array).

In this specific case, you have two ways to go:

  • bad: change the trait to return Self::Array and panic in the implementation;
  • good: move the match arr { ... } from base64_to_array to the place where it's called.
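
A minimal sketch of the two options side by side, using only the standard library. `checked`, `checked_or_panic`, and `total_brightness` are hypothetical names, and the `String` error replaces ndarray's `ShapeError`; the only point is where the error handling is placed.

```rust
/// Hypothetical fallible conversion: Err if the element count does not fit.
fn checked(data: Vec<u8>, height: usize, width: usize) -> Result<Vec<u8>, String> {
    if data.len() == height * width {
        Ok(data)
    } else {
        Err(format!("{} elements cannot form {}x{}", data.len(), height, width))
    }
}

/// "Bad" option: hide the failure by panicking inside the function.
/// The caller gets a plain Vec but has no chance to recover.
fn checked_or_panic(data: Vec<u8>, height: usize, width: usize) -> Vec<u8> {
    checked(data, height, width).expect("shape mismatch")
}

/// "Good" option: keep the Result and let the caller decide.
/// Here the caller forwards the error to its own caller with `?`.
fn total_brightness(data: Vec<u8>, height: usize, width: usize) -> Result<u32, String> {
    let pixels = checked(data, height, width)?;
    Ok(pixels.iter().map(|&p| u32::from(p)).sum())
}
```

Keeping the `Result` in the signature means a binary can still choose to panic at the top level, while a library user can report the problem or retry with different dimensions.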

Fantastic, thank you very much! This small task actually taught me a lot of Rust! I really appreciate all the help and guidance you provided! Here is the final code that I came up with:

In loading.rs:

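use ndarray::{Array2, Array3, ShapeError};

pub trait BaseToArray<const N: usize>: Sized {
    type Array;
    fn base64_to_array(self, dimensions: [usize; N]) -> Result<Self::Array, ShapeError>;
}

impl<T> BaseToArray<2> for Vec<T> {
    type Array = Array2<T>;
    fn base64_to_array(self, dimensions: [usize; 2]) -> Result<Self::Array, ShapeError> {
        let [height, width] = dimensions;
        // Return the Result as-is; the caller decides how to handle a shape mismatch.
        Self::Array::from_shape_vec((height, width), self)
    }
}

impl<T> BaseToArray<3> for Vec<T> {
    type Array = Array3<T>;
    fn base64_to_array(self, dimensions: [usize; 3]) -> Result<Self::Array, ShapeError> {
        let [frame, height, width] = dimensions;
        Self::Array::from_shape_vec((frame, height, width), self)
    }
}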

In main.rs:

mod loading;

use std::fs::File;

use serde::Deserialize;

use loading::BaseToArray;

#[derive(Debug, Deserialize)]
struct Study {
    height: usize,
    width: usize,
    image_data: String,
    nb_frames: usize,
    pred_coords: String,
}

fn main() {
    // Load the JSON file
    let file = File::open("../../Dummy/json_files/study.json").unwrap();
    let study: Study = serde_json::from_reader(&file).expect("JSON was not well-formatted");

    // Extract the image array from the JSON data
    let img64: Vec<u8> = base64::decode(study.image_data).unwrap();
    let img_arr = img64.base64_to_array([study.nb_frames, study.height, study.width]);
    // Handle the Result at the call site
    let img_arr = match img_arr {
        Ok(img_arr) => img_arr,
        Err(error) => panic!("Shape mismatch when building the array: {:?}", error),
    };

    println!("IMG ARRAY SHAPE {:?}", img_arr.shape());
}

And it works perfectly !

Thank you again for all your kind help! If you have any more suggestions or tips to improve this piece of code, I'd be really happy to learn more! Regardless, thank you very kindly!