Fixing Polars 0.37 compilation errors: Hashbrown 0.17 dependency conflict and decimal parsing

I recently worked on stabilizing a data pipeline using Polars 0.37 in a Windows environment. During the process, I encountered a series of critical versioning conflicts and API changes that might be of interest to the community and library maintainers.

1. The Dependency Conflict (The "Hashbrown" Break)

  • Problem: Compilation failed with the error: failed to resolve: use of undeclared type raw_table_mut.
  • Root Cause: A major version bump in hashbrown (from 0.14 to 0.17) introduced breaking changes. Indirect dependencies (like indexmap) were pulling the newer version, while polars-core (v0.37) still relied on the 0.14 internal API.
  • Solution: Strict version pinning in Cargo.toml:

    [dependencies]
    polars = { version = "0.37.0", features = ["lazy", "csv", "strings"] }
    indexmap = "=2.2.6"
    hashbrown = "=0.14.5"

2. Memory Safety & API Surface (Arc<Schema>)

  • Problem: mismatched types: expected &Schema, found Arc<Schema>.
  • Root Cause: In Polars 0.37, with_dtype_overwrite takes an Option<&Schema>, that is, an optional borrowed schema. It does not take ownership of the Arc<Schema>, so passing the Arc directly fails to type-check.
  • Solution: Adjusting the implementation to use proper borrowing: Some(&schema_ref).
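
To make the borrowing point concrete, here is a minimal sketch in plain Rust that does not depend on Polars. The `Schema` struct and the `count_overrides` function are hypothetical stand-ins, introduced only to mirror a parameter of type `Option<&Schema>`; they are not the Polars API.

```rust
use std::sync::Arc;

// Stand-in for a schema type; hypothetical, only here so the sketch
// compiles without the polars crate.
#[derive(Debug)]
struct Schema {
    fields: Vec<(String, String)>,
}

// Hypothetical function whose parameter has the same shape as the one
// described above: an optional *borrowed* schema.
fn count_overrides(schema: Option<&Schema>) -> usize {
    schema.map_or(0, |s| s.fields.len())
}

fn main() {
    let schema_ref: Arc<Schema> = Arc::new(Schema {
        fields: vec![("price".to_string(), "String".to_string())],
    });

    // `&*schema_ref` dereferences the Arc and borrows the Schema inside it,
    // producing the `&Schema` the function expects.
    let n = count_overrides(Some(&*schema_ref));
    println!("{n} override(s)");

    // The Arc was only borrowed, so it is still usable afterwards.
    println!("strong count = {}", Arc::strong_count(&schema_ref));
}
```

The explicit `&*` spelling is used here only to make the borrow visible; the point is that the caller keeps ownership of the `Arc` and hands out a reference.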

3. Regional Format Parsing (Comma as Decimal)

  • Problem: ComputeError: could not parse '1,27' as dtype f64.
  • Root Cause: The input CSV used regional formatting (a comma as the decimal separator). Float parsing in Rust and Polars expects a dot as the decimal separator by default, so '1,27' is rejected.
  • Solution: Instead of relying on the LazyCsvReader auto-inference, I implemented a Lazy Transformation Pipeline:
    1. Loaded numeric columns as DataType::String.
    2. Used the str().replace() method with the literal: true flag.
    3. Cast the result to Float64.
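
The three steps above amount to "read as text, swap the separator, then parse". A rough sketch of that idea on a single value, using only the Rust standard library rather than Polars, might look like the following. The helper name `parse_decimal_comma` is made up for illustration, and the sketch assumes the input contains no thousands separators.

```rust
/// Parse a number written with a decimal comma, e.g. "1,27".
/// Assumption: the input has no thousands separators; a value with
/// more than one separator fails to parse and yields None.
fn parse_decimal_comma(raw: &str) -> Option<f64> {
    raw.trim()              // step 1: treat the cell as plain text
        .replace(',', ".")  // step 2: literal replacement, no regex involved
        .parse::<f64>()     // step 3: convert to a 64-bit float
        .ok()
}

fn main() {
    for raw in ["1,27", "3,5", "42", "n/a"] {
        println!("{raw:?} -> {:?}", parse_decimal_comma(raw));
    }
}
```

Returning `Option<f64>` keeps unparseable cells visible as `None` instead of aborting, which is one possible design choice; a pipeline that should fail fast could return a `Result` instead.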

4. API Arity Change in .str().replace()

  • Observation: In this version, the replace method in the string namespace takes 3 arguments: pattern, value, and an explicit bool for literal matching.
  • Code fix: .str().replace(lit(","), lit("."), true).

Conclusion: While the "Dependency Hell" was the most frustrating part, it highlights the fragility of transitive dependencies in large ecosystems like Polars. I suggest developers keep a close eye on hashbrown versioning and perhaps consider more robust decimal handling for regional datasets.

Hopefully, this documentation helps others facing similar compilation issues on older stable versions.

You shouldn't need to pin exact versions. hashbrown = "0.14" should be enough for your own code. And indexmap afaik doesn't expose hashbrown at all, so the fact that it updated to hashbrown 0.17 won't affect you at all. Cargo will build both hashbrown 0.14 and hashbrown 0.17 as both versions are considered semver incompatible.

This has nothing to do with memory safety at all.

CsvParseOptions has a decimal_comma field you can use.