Polars null values

I'm wondering where the best place to go is to ask questions about the polars crate. I have the following questions:

  1. How can I handle invalid UTF-8 characters when reading .csv files into a polars DataFrame?
  2. I would like to replace all null values with 0 or 0.0 in columns that are i64/f64. I can get the data type of each field after reading a file and I see there is a method to check if the values are null, but I'm wondering if someone can point me in the right direction as to how to implement this.

Thanks!

Hi mthelm.

  1. You can use CsvEncoding::LossyUtf8 to handle invalid UTF-8. Each invalid byte sequence is replaced with the Unicode replacement character (�) instead of causing an error.
  2. See the example below. (I assumed you are using the polars eager API.)
use polars::prelude::*;

fn main() -> Result<()> {
    let mut df = CsvReader::from_path("some_file.csv")?
        // replace invalid UTF-8 byte sequences with �
        .with_encoding(CsvEncoding::LossyUtf8)
        .finish()?;

    // collect the data type of each column
    let dtypes = df.dtypes();

    // pattern match the datatypes and fill missing values with zero
    for (i, dt) in dtypes.iter().enumerate() {
        if let DataType::Int64 | DataType::Float64 = dt {
            df.may_apply_at_idx(i, |s| s.fill_none(FillNoneStrategy::Zero))?;
        }
    }

    Ok(())
}
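
If it helps to see what the two steps do without polars in the picture, here is a minimal std-only sketch. It is a model of the behaviour, not how polars implements it: decode_lossy and fill_zero are hypothetical helper names I made up for illustration, and the sample bytes and column values are invented.

```rust
// Std-only model of the two steps above.
// `decode_lossy` and `fill_zero` are hypothetical helpers, not polars APIs.

/// Decode bytes as UTF-8, replacing each invalid sequence with U+FFFD (�).
fn decode_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

/// Replace missing values with zero and leave present values unchanged.
fn fill_zero(values: &[Option<i64>]) -> Vec<i64> {
    values.iter().map(|v| v.unwrap_or(0)).collect()
}

fn main() {
    // 0xFF can never appear in valid UTF-8, so it is replaced.
    let raw = b"caf\xff";
    println!("{}", decode_lossy(raw));

    // `None` plays the role of a null cell in an i64 column.
    let column: [Option<i64>; 3] = [Some(1), None, Some(3)];
    println!("{:?}", fill_zero(&column));
}
```

Note that lossy decoding is one-way: once a byte sequence has been replaced with � the original bytes are gone, so it is worth keeping the source file around if you might need to re-read it with the correct encoding later.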