How can we load Parquet data from memory into DataFusion?

I have some example code that reads data from a Parquet file.

The examples I found all read from a file on local disk, but I want to load the data from memory, because my data has already been loaded into memory by another method.

How can I load the file data from memory?

In the code below, I have file_data in memory, but I don't know how to use it with DataFusion.

Thank you.


use arrow::record_batch::RecordBatch;
use datafusion::{
    error::DataFusionError,
    prelude::{ParquetReadOptions, SessionContext},
};

#[tokio::main]
async fn main() -> Result<(), DataFusionError> {
    let ctx = SessionContext::new();

    // load file data
    let file_data = std::fs::read("data/example.parquet").unwrap();
    println!("{:?}", file_data.len());

    // How can we load data from memory (file_data) into DataFusion?
    // TODO!
    // Something like this (register_parquet_from is a made-up name, not a real API):
    // ctx.register_parquet_from("foo", file_data, ParquetReadOptions::default()).await?;

    // create the dataframe

    // create a plan
    let df = ctx.sql("SELECT count(*) FROM foo").await?;

    // execute the plan
    let results: Vec<RecordBatch> = df.collect().await?;

    // format the results
    let pretty_results = arrow::util::pretty::pretty_format_batches(&results)?.to_string();

    println!("{}", pretty_results);

    Ok(())
}


[package]
name = "datafusion"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at

[dependencies]
arrow = { version = "20.0.0", features = ["prettyprint"] }
datafusion = "11.0"
tokio = { version = "1.0", features = ["macros", "rt-multi-thread"] }
