How can we load Parquet data from memory into DataFusion?

I have example code that reads data from a Parquet file.

The examples I found all read from a file on local disk, but I want to load the data from memory, because my data has already been loaded into memory by another method.

How can I load the file data from memory?

In the code below I have file_data in memory, but I don't know how to pass it to DataFusion.

Thank you.

src/main.rs

use arrow::record_batch::RecordBatch;
use datafusion::{
    error::DataFusionError,
    prelude::{ParquetReadOptions, SessionContext},
};

#[tokio::main]
async fn main() -> Result<(), DataFusionError> {
    let ctx = SessionContext::new();

    // load file data
    let file_data = std::fs::read("data/example.parquet").unwrap();
    println!("{:?}", file_data.len());

    // How can we load the in-memory data (file_data) into DataFusion?
    // TODO!
    // ctx.register_parquet_from("foo", file_data, ParquetReadOptions::default()).await?;

    // register the Parquet file on disk as table "foo"
    ctx.register_parquet(
        "foo",
        "data/example.parquet",
        ParquetReadOptions::default(),
    )
    .await?;

    // create a plan
    let df = ctx.sql("SELECT count(*) FROM foo").await?;

    // execute the plan
    let results: Vec<RecordBatch> = df.collect().await?;

    // format the results
    let pretty_results = arrow::util::pretty::pretty_format_batches(&results)?.to_string();

    println!("{}", pretty_results);

    Ok(())
}

Cargo.toml

[package]
name = "datafusion"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
arrow = { version = "20.0.0", features = ["prettyprint"] }
datafusion = "11.0"
tokio = { version = "1.0", features = ["macros", "rt-multi-thread"] }
