Running a Rust program on top of Hadoop in a cluster

Is there any way to write code in Rust that runs on top of Hadoop in a cluster? If so, please explain in some detail.

I'm not sure I understand what you want to achieve. Could you please be a little more specific? Do you want to execute some Rust code with Hadoop YARN? If so, you can package the program in a Docker container and have YARN schedule it.

I've also found the efflux crate which is supposed to offer a Rust interface to Hadoop streaming and MapReduce.
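
To give an idea of what Hadoop streaming expects, here is a minimal mapper sketch that uses only the standard library rather than efflux. It relies on the streaming convention that a mapper reads input lines from stdin and writes tab-separated key/value lines to stdout. The function name `map_line`, the choice of the first CSV column as the key, and the naive comma split are all assumptions made for illustration, not anything from efflux or from your data.

```rust
use std::io::{self, BufRead, BufWriter, Write};

/// Hypothetical mapper logic: take the first CSV column as the key and emit a count of 1.
/// Assumption: fields are split naively on commas, so quoted fields containing
/// commas are not handled. Lines with an empty first column are skipped.
fn map_line(line: &str) -> Option<(String, u64)> {
    let key = line.split(',').next()?.trim();
    if key.is_empty() {
        None
    } else {
        Some((key.to_string(), 1))
    }
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let stdout = io::stdout();
    // Buffer the output, since a mapper may emit a very large number of lines.
    let mut out = BufWriter::new(stdout.lock());

    for line in stdin.lock().lines() {
        if let Some((key, value)) = map_line(&line?) {
            // Hadoop streaming treats the text before the first tab as the key.
            writeln!(out, "{}\t{}", key, value)?;
        }
    }
    out.flush()
}
```

The compiled binary would then be handed to the Hadoop streaming jar through its `-mapper` option and shipped to the nodes with `-files`. It has to be built for the architecture and libc of the worker nodes, so a statically linked build is probably the easiest route.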

If you want to work with Parquet files hosted on your Hadoop cluster, there are the arrow2 and polars crates.

Yes, I would like to run a Rust program in Hadoop YARN. The program needs to read a CSV file from HDFS, perform some operations on it, compress the result into a zip format, and finally upload it to Amazon S3.
