I was following this answer, but it didn't work for me on my version (polars = { version = "0.42.0", features = ["dtype-struct", "lazy", "polars-io"] }). I see that in the new version into_struct returns ChunkedArray<StructType> instead of StructChunked. Surprisingly, iter in the new version yields Option<()>, which seems useless. Does that mean it's impossible to iterate over a ChunkedArray<StructType>? Or is there a different way of doing it? Also, if you know the motivation behind that change, I'd be glad to learn about it.
Rust Polars iterate ChunkedArray<StructType>
[package]
name = "pol"
version = "0.1.0"
edition = "2021"
[dependencies]
polars = {version="0.43.0",features=["mode","polars-io","csv","polars-ops","lazy","docs-selection","streaming","regex","temporal","is_unique","is_between","dtype-date","dtype-datetime","dtype-time","dtype-duration","dtype-categorical","rows","is_in","pivot"]}
polars-io = "0.43.0"
polars-lazy = "0.43.0"
.downcast_iter() can be used, but then you have to work with the individual chunks, which is quite complicated.
The essence of this problem is that a Struct is not meant for row iteration over a DataFrame. A Struct is used to package the complex results of custom functions into a single Series in complex aggregations. Row iteration in Polars carries a large efficiency loss because it involves many type conversions.
df.get_row(&self, idx: usize) can be used for row-wise work, but it is slow.
In fact, the Series of a DataFrame can be taken out and iterated directly. For this we use itertools::multizip, which is much more efficient than Polars' built-in df.get_row function.
Add itertools to your Cargo.toml:
[dependencies]
itertools = "0.13.0"
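The row-wise iterator code:
use itertools::multizip;
use polars::prelude::*;

#[derive(Debug)]
pub struct Person {
    id: u32,
    name: String,
    age: u32,
}

// The `?` operator needs a function that returns a Result.
fn main() -> PolarsResult<()> {
    let df = df!(
        "id" => &[1u32, 2, 3],
        "name" => &["John", "Jane", "Bobby"],
        "age" => &[32u32, 28, 45]
    )?;

    // Take ownership of the columns, then get a typed iterator over each one.
    let objects = df.take_columns();
    let id_ = objects[0].u32()?.iter();
    let name_ = objects[1].str()?.iter();
    let age_ = objects[2].u32()?.iter();

    // Zip the column iterators so that each item is one row.
    let combined = multizip((id_, name_, age_));
    let res: Vec<_> = combined
        .map(|(a, b, c)| Person {
            // Each value is an Option; unwrap() panics if the value is null.
            id: a.unwrap(),
            name: b.unwrap().to_owned(),
            age: c.unwrap(),
        })
        .collect();
    println!("{:?}", res);
    Ok(())
}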
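The column iterators above yield Option values, so the unwrap() calls panic as soon as a column contains a null. Below is a minimal sketch of a null-aware variant. It uses only the standard library, with plain iterators of Option standing in for the Polars column iterators, and the helper name rows_skipping_nulls is hypothetical, not part of Polars or itertools. It skips any row that has a null field rather than panicking.

```rust
/// Row type mirroring the `Person` struct from the answer.
#[derive(Debug, PartialEq)]
pub struct Person {
    pub id: u32,
    pub name: String,
    pub age: u32,
}

/// Hypothetical helper (not a Polars API): zips three nullable columns
/// into rows and skips every row in which at least one value is `None`.
pub fn rows_skipping_nulls<'a>(
    ids: impl Iterator<Item = Option<u32>>,
    names: impl Iterator<Item = Option<&'a str>>,
    ages: impl Iterator<Item = Option<u32>>,
) -> Vec<Person> {
    ids.zip(names)
        .zip(ages)
        .filter_map(|((id, name), age)| {
            // `?` makes the closure return None on the first null field,
            // and filter_map then drops that row.
            Some(Person {
                id: id?,
                name: name?.to_owned(),
                age: age?,
            })
        })
        .collect()
}
```

Whether skipping is the right policy depends on the data; returning an error or substituting a default for null values are equally reasonable choices.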
pub type StructChunked = ChunkedArray<StructType>; – Suspect
struct_fields to get the fields, fields_as_series to get the data – Suspect