Convert Polars dataframe to vector of structs

I am making a Maturin project involving Polars on both the Python and Rust side.

In Python I have a dataframe with columns a and b:

import polars as pl
df = pl.DataFrame({'a': [1, 2], 'b': ['foo', 'bar']})

In Rust I have a struct MyStruct with the fields a and b:

struct MyStruct {
  a: i64,
  b: String,
}

I would like to convert each row in the dataframe to an instance of MyStruct, mapping the dataframe to a vector of MyStructs. This should be done on the Rust side.

I can get this done on the Python side (assuming MyStruct is exposed as a pyclass): first get a list of Python dicts, then construct a Python list of MyStruct instances.

df_as_list = df.to_struct('MyStruct').to_list()
[MyStruct(**x) for x in df_as_list]

To spice things up a bit more, imagine that MyStruct has an enum field instead of a String field:

enum MyEnum {
  Foo,
  Bar,
}
struct MyStruct {
  a: i64,
  b: MyEnum,
}

With a suitable function string_to_myenum that maps strings to MyEnum (that is, "foo" to Foo and "bar" to Bar) it would be great to map the dataframe to the new MyStruct.
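One possible shape for such a function, as a minimal sketch: the question only names `string_to_myenum` and the two mappings, so the `Option` return type (to signal unrecognized strings rather than panic) and the added derives are assumptions.

```rust
/// The enum from the question, with derives added so values can be
/// compared and printed (the derives are an assumption, not in the question).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MyEnum {
    Foo,
    Bar,
}

/// Hypothetical helper: maps "foo" to Foo and "bar" to Bar.
/// Any other input yields None instead of panicking.
fn string_to_myenum(s: &str) -> Option<MyEnum> {
    match s {
        "foo" => Some(MyEnum::Foo),
        "bar" => Some(MyEnum::Bar),
        _ => None,
    }
}
```

Returning `Option` keeps the decision about unknown strings with the caller; implementing `std::str::FromStr` for `MyEnum` would be a reasonable alternative.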

Teleran answered 7/3, 2024 at 8:9 Comment(2)
Note that while it is possible, it will be slow. Much better to stay in DataFrame land as long as you can. (Slender)
This is essentially the opposite of Creating Polars Dataframe from Vec<Struct>. (Slender)

Zip the columns together:

let arr: Vec<MyStruct> = df["a"]
    .i64()
    .expect("`a` column of wrong type")
    .iter()
    .zip(df["b"].str().expect("`b` column of wrong type").iter())
    .map(|(a, b)| {
        Some(MyStruct {
            a: a?,
            b: b?.to_owned(),
        })
    })
    .collect::<Option<Vec<_>>>()
    .expect("found unexpected null");

Note, however, that as I said in the comments, this will be slow, especially for large DataFrames: every row is materialized individually, including a fresh String allocation per row. Prefer to do things using the Polars APIs where possible.
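The same zip pattern could be extended to the enum variant from the question. The following is only a sketch under stated assumptions: plain iterators of `Option` values stand in for the Polars columns so that the block has no dependencies, and `rows_to_structs`, the `Option`-returning `string_to_myenum`, and the derives are hypothetical names and choices, not something from the question or from Polars.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MyEnum {
    Foo,
    Bar,
}

#[derive(Debug, PartialEq)]
struct MyStruct {
    a: i64,
    b: MyEnum,
}

/// Hypothetical helper: "foo" -> Foo, "bar" -> Bar, anything else -> None.
fn string_to_myenum(s: &str) -> Option<MyEnum> {
    match s {
        "foo" => Some(MyEnum::Foo),
        "bar" => Some(MyEnum::Bar),
        _ => None,
    }
}

/// Zips two nullable "columns" row by row.
/// Returns None if any value is null or any string is not a known variant.
fn rows_to_structs<'a>(
    a_col: impl Iterator<Item = Option<i64>>,
    b_col: impl Iterator<Item = Option<&'a str>>,
) -> Option<Vec<MyStruct>> {
    a_col
        .zip(b_col)
        .map(|(a, b)| {
            Some(MyStruct {
                a: a?,
                b: string_to_myenum(b?)?,
            })
        })
        // Collecting into Option<Vec<_>> stops at the first None.
        .collect()
}
```

With a real DataFrame, the two arguments would presumably be the same iterators as in the snippet above, that is, the results of calling `.iter()` on the `i64` and `str` views of the `a` and `b` columns.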

Slender answered 6/7, 2024 at 21:24 Comment(1)
You are absolutely right about this being slow, but I have to make the conversion at some point because I'm getting tabular data in Python and I'm using Rust code that works on a vector of this particular struct. (Teleran)
