Creating Polars Dataframe from Vec<Struct>

Supposing I have a vector of structs like so:


struct Test {
    id: u32,
    amount: u32,
}

fn main() {
    let test_vec: Vec<Test> = vec![Test { id: 1, amount: 3 }, Test { id: 3, amount: 4 }];
}

Is there a way to get this into a polars dataframe with the column names being the struct fields?

Hoping to get an output as follows:

   id  amount
0   1       3
1   3       4
Henrie answered 29/7, 2022 at 13:19 Comment(3)
A dataframe is organized by column, not by row. It looks to me like you'll have to create the dataframe from series manually. - Monk
Thanks for the advice. I gave it a go but found it too verbose and settled on the solution below! - Henrie
For the opposite question, see Convert Polars dataframe to vector of structs. - Keven
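To make the "by column, not by row" point from the comments concrete, here is a minimal sketch of the manual transposition using only the standard library. The helper name `to_columns` is hypothetical and not part of Polars; it only shows the row-to-column step that any manual approach has to perform before building series.

```rust
/// Row type from the question.
struct Test {
    id: u32,
    amount: u32,
}

/// Hypothetical helper (not a Polars API): split a slice of rows
/// into one vector per struct field, preserving row order.
fn to_columns(rows: &[Test]) -> (Vec<u32>, Vec<u32>) {
    // Pre-allocate so each column vector is sized once.
    let mut ids = Vec::with_capacity(rows.len());
    let mut amounts = Vec::with_capacity(rows.len());
    for row in rows {
        ids.push(row.id);
        amounts.push(row.amount);
    }
    (ids, amounts)
}
```

Each returned vector can then back one dataframe column, for example by passing them to the `df!` macro from `polars::prelude` as `"id" => ids, "amount" => amounts`. The cost is that every new struct field needs another vector and another push, which is presumably the verbosity mentioned above.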

After a lot of head banging, I found the following solution.

If you have a vector of a custom struct, to get it into a Polars dataframe you can do the following:

// Assumes polars is built with its `json` feature and serde with `derive`.
use polars::prelude::*;
use serde::Serialize;
use std::io::Cursor;

// 1. Derive serde::Serialize for your struct
#[derive(Serialize)]
struct Test {
    id: u32,
    amount: u32,
}

// (Adding a `new` method here for quality of life.)
impl Test {
    fn new(id: u32, amount: u32) -> Self {
        Test { id, amount }
    }
}

fn main() {
    // 2. Serialize your struct Vec to a JSON string
    let test_vec: Vec<Test> = vec![Test::new(1, 3), Test::new(3, 4)];
    let json = serde_json::to_string(&test_vec).unwrap();

    // 3. Create a cursor over the JSON string
    let cursor = Cursor::new(json);

    // 4. Create a polars DataFrame by reading the cursor as JSON
    let df = JsonReader::new(cursor).finish().unwrap();
    println!("{df}");
}
    
Henrie answered 2/8, 2022 at 15:34 Comment(0)

I dislike the accepted answer for a couple of reasons:

  1. It is not type-safe: you lose type information if you convert, for example, a chrono::NaiveDate field, which will come back as a str in your DataFrame.
  2. It is inefficient, since you need to serialize and deserialize your data.

I think a much better solution is a macro:

macro_rules! struct_to_dataframe {
    ($input:expr, [$($field:ident),+]) => {
        {
            // Evaluate the input expression once; it is consumed below
            let input = $input;
            let len = input.len();

            // Extract the field values into separate vectors
            $(let mut $field = Vec::with_capacity(len);)+

            for e in input.into_iter() {
                $($field.push(e.$field);)+
            }
            // Needs `df!` in scope, e.g. via `use polars::prelude::*;`
            df! {
                $(stringify!($field) => $field,)+
            }
        }
    };
}

You should be able to call it like so:

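// Assumes the struct_to_dataframe! macro is defined earlier in the same file.
use polars::prelude::*;

struct Test {
    id: u32,
    amount: u32,
}

impl Test {
    fn new(id: u32, amount: u32) -> Self {
        Test { id, amount }
    }
}

fn main() {
    let test_vec: Vec<Test> = vec![Test::new(1, 3), Test::new(3, 4)];
    // df! yields a Result, so the macro's output needs unwrapping
    let df = struct_to_dataframe!(test_vec, [id, amount]).unwrap();
    println!("{df}");
}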
Pleuron answered 2/6, 2023 at 21:30 Comment(1)
This is great, thank you! I have tweaked the macro to use with_capacity(), for much more efficient creation of struct vectors with many elements. - Inchmeal
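For anyone who wants to see what the macro expands to without pulling in Polars, here is a rough sketch of a sibling macro that stops just before the `df!` call. The name `struct_to_columns!` is hypothetical, and unlike the macro in the answer it borrows the rows and clones each field instead of consuming the vector; that is an assumption made for illustration, not something the answer does.

```rust
/// Row type from the question.
struct Test {
    id: u32,
    amount: u32,
}

// Hypothetical std-only variant: returns a tuple of (column name, values)
// pairs, one per listed field, instead of building a DataFrame.
macro_rules! struct_to_columns {
    ($input:expr, [$($field:ident),+ $(,)?]) => {{
        // Evaluate the input expression exactly once.
        let rows = $input;
        let len = rows.len();

        // One pre-sized vector per field, named after the field itself.
        $(let mut $field = Vec::with_capacity(len);)+

        // Borrow each row and clone the field values out of it.
        for row in rows.iter() {
            $($field.push(row.$field.clone());)+
        }

        // stringify! turns the field identifier into the column name.
        ( $((stringify!($field), $field),)+ )
    }};
}
```

Called as `struct_to_columns!(&test_vec, [id, amount])`, this gives back one `(name, Vec)` pair per field and leaves `test_vec` usable afterwards. The clone is free for `Copy` fields like `u32` but can be costly for large owned fields, which is a reason to prefer the consuming version when the vector is no longer needed.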
