Deserialize Python's NumPy objects from pickle files

Being pretty new to Rust, I may not be going about this the right way.
I work with serialized Python objects stored in pickle files, and these files contain serialized NumPy ndarrays. I would like to deserialize them into a Rust Vec.

From what I found, deserializing pickle files is easy with the serde-pickle crate, and the same goes for the NumPy msgpack file format with the msgpack-numpy crate.
But I found nothing for deserializing NumPy ndarrays stored in pickle files...

What would be an approach to do that without first running Python code to convert the NumPy arrays into Python lists?

Thanks a lot :slight_smile:

Write a Python program to do it.

You're correct that crates exist for dealing with a subset of the pickle format, but pickle is idiosyncratic to the Python language in a lot of ways. There are plenty of valid pickle streams that serde-pickle is unable to process. More generally, an implementation of pickle in another language, Rust included, is going to run into problems one way or another because of differences between the host language's rules and Python's. Missing capabilities are probably the most minor example of that, but they are enough to slow you to a crawl.
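To make that concrete, here is a minimal sketch that lists the opcodes in a pickle stream using only the standard library. `Fraction` is just a stand-in I picked so the example runs without NumPy, and `opcode_names` is a hypothetical helper name. As far as I know, a pickled ndarray is rebuilt through the same kind of import-and-call instructions, pointing at NumPy's own functions, which is what a non-Python reader has to emulate.

```python
import pickle
import pickletools
from fractions import Fraction


def opcode_names(obj):
    """Return the names of the opcodes pickle emits for obj (protocol 2)."""
    stream = pickle.dumps(obj, protocol=2)
    return [op.name for op, _arg, _pos in pickletools.genops(stream)]


# A plain list of floats needs only data opcodes.
plain = opcode_names([1.0, 2.0])

# A Fraction is stored as "import fractions.Fraction, then call it with
# these arguments": GLOBAL names the callable and REDUCE invokes it.
rich = opcode_names(Fraction(1, 2))

print("list:    ", plain)
print("Fraction:", rich)
```

A reader written in another language can handle the first stream easily. For the second it has to know what the named Python callable does and reproduce it.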

While it would certainly be possible to implement a Rust parser that solves this specific issue, your time is almost certainly better spent writing a program in Python to convert the pickled data to a more portable format. Depending on the size and structure of your NumPy ndarrays, you might be able to get away with a CSV file, or you might be better served by something like SQLite or a portable format designed for large numerical arrays.
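A conversion script along those lines could be as small as the sketch below. It assumes each pickle file holds a single 1-D or 2-D numeric array and that CSV is an acceptable target. `pickle_to_csv` and the file names in the usage line are hypothetical. NumPy is never imported explicitly, but it has to be installed so that `pickle.load` can rebuild the ndarray.

```python
import csv
import pickle


def pickle_to_csv(pickle_path, csv_path):
    """Load one pickled array and write it as CSV rows; return the row count."""
    # Only unpickle files you trust: pickle.load can execute arbitrary code.
    with open(pickle_path, "rb") as f:
        obj = pickle.load(f)  # needs NumPy installed if the file holds an ndarray

    # ndarray.tolist() yields nested Python lists; plain lists pass through.
    data = obj.tolist() if hasattr(obj, "tolist") else obj

    # Treat a 1-D array as a single row so the writer always sees rows.
    if data and not isinstance(data[0], (list, tuple)):
        data = [data]

    with open(csv_path, "w", newline="") as f:
        csv.writer(f).writerows(data)
    return len(data)
```

You would call it as `pickle_to_csv("arrays.pkl", "arrays.csv")` and then read the CSV from Rust with whatever CSV reader you prefer. For large or higher-dimensional arrays, a binary format is probably a better fit than CSV.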
