In polars I want to map a struct column into an actual Rust struct, but doing that requires handling the struct column as a vector of impl Scalar values that can only be cast to Any, which then rely on you to call downcast_ref to get the concrete type.
But the problem is that the concrete types are not obvious, so I have to correctly guess the proper type to use for downcast_ref<T>.
How can I know the type? Essentially I'm in a situation where I only know the following:
The String fields have:
TypeId: (16029866226928567356, 7151959947073222850)
ArrowDataType: Utf8View
Debug formatted sample: Scalar(Some("rank"))
The integer fields have:
TypeId: (13743848030338613455, 2905191465459971113)
ArrowDataType: Int32
Debug formatted sample: PrimitiveScalar { value: Some(4), data_type: Int32 }
So given only these pieces of information, how can I know what type to use to downcast_ref correctly?
I've been guessing the type, but it always returns None
...
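To show the behaviour I'm running into without polars, here is a minimal sketch using only the standard library. TextScalar, IntScalar, the Scalar trait and describe are hypothetical stand-ins made up for this illustration; they are not polars' actual types. It shows that downcast_ref only succeeds when T is exactly the concrete type behind the &dyn Any, so guessing the payload type (String) returns None, which matches what I see.

```rust
use std::any::{Any, TypeId};

// Hypothetical stand-ins for a library's scalar types (NOT polars' real types).
#[derive(Debug)]
struct TextScalar(Option<String>);

#[derive(Debug)]
struct IntScalar {
    value: Option<i32>,
}

// Minimal trait mirroring the "scalar that only hands out &dyn Any" shape.
trait Scalar: std::fmt::Debug {
    fn as_any(&self) -> &dyn Any;
}

impl Scalar for TextScalar {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Scalar for IntScalar {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Try each known concrete type in turn; only an exact match succeeds.
fn describe(scalar: &dyn Scalar) -> String {
    let any = scalar.as_any();
    if let Some(text) = any.downcast_ref::<TextScalar>() {
        format!("text:{}", text.0.as_deref().unwrap_or("NA"))
    } else if let Some(int) = any.downcast_ref::<IntScalar>() {
        match int.value {
            Some(v) => format!("int:{v}"),
            None => "int:NA".to_owned(),
        }
    } else {
        "unknown".to_owned()
    }
}

fn main() {
    let scalars: Vec<Box<dyn Scalar>> = vec![
        Box::new(TextScalar(Some("rank".to_owned()))),
        Box::new(IntScalar { value: Some(4) }),
    ];

    // Guessing the payload type fails: the value behind the trait object
    // is a TextScalar wrapper, not a String.
    assert!(scalars[0].as_any().downcast_ref::<String>().is_none());

    // The TypeId only helps once there is a candidate type to compare against.
    assert_eq!(scalars[0].as_any().type_id(), TypeId::of::<TextScalar>());

    for scalar in &scalars {
        println!("{}", describe(scalar.as_ref()));
    }
}
```

So a TypeId on its own only lets me confirm a candidate type; it does not tell me the name of the type, which is why I am stuck guessing.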
Full Example:
    use polars::prelude::*;

    fn main() -> PolarsResult<()> {
        let outcome = df!(
            "a" => ["a", "b"],
            "b" => [1i32, 2i32],
        ).unwrap().lazy().select([as_struct(vec![all()]).alias("info")]).collect().unwrap()
            .column("info")?
            .struct_()?.clone()
            .into_series().iter()
            .map(|s| match s {
                AnyValue::Struct(_length, entries, field) => {
                    println!("field: {field:?}");
                    let q = entries.into_iter().map(|n| match n {
                        Some(inner) => inner.into_iter().map(|scal| {
                            let s_any = scal.as_any();
                            // TypeId of the concrete scalar; this is all I know about it.
                            let _t = s_any.type_id();
                            let na = "NA".to_owned();
                            // My guess at the concrete type: this always yields None,
                            // so the "NA" fallback is used.
                            let guess = s_any.to_owned().downcast_ref::<String>().unwrap_or(&na);
                            let dt = scal.data_type();
                            println!(" casted: {guess:?} datatype: {dt:?} scalar: {scal:?}");
                            guess.to_owned()
                        }).collect::<Vec<_>>(),
                        _ => panic!("")
                    }).collect::<Vec<_>>();
                    q
                },
                _ => {
                    panic!("")
                }
            })
            .collect::<Vec<_>>();
        println!("outcome in rust: {outcome:?}");
        Ok(())
    }
console output:
field: [Field { name: "a", dtype: String }, Field { name: "b", dtype: Int32 }]
casted: "NA" datatype: Utf8View scalar: Scalar(Some("a"))
casted: "NA" datatype: Int32 scalar: PrimitiveScalar { value: Some(1), data_type: Int32 }
casted: "NA" datatype: Utf8View scalar: Scalar(Some("b"))
casted: "NA" datatype: Int32 scalar: PrimitiveScalar { value: Some(2), data_type: Int32 }
field: [Field { name: "a", dtype: String }, Field { name: "b", dtype: Int32 }]
casted: "NA" datatype: Utf8View scalar: Scalar(Some("a"))
casted: "NA" datatype: Int32 scalar: PrimitiveScalar { value: Some(1), data_type: Int32 }
casted: "NA" datatype: Utf8View scalar: Scalar(Some("b"))
casted: "NA" datatype: Int32 scalar: PrimitiveScalar { value: Some(2), data_type: Int32 }
outcome in rust: [[["NA", "NA"], ["NA", "NA"]], [["NA", "NA"], ["NA", "NA"]]]