I'm using polars and I would like to define the column types while loading a DataFrame. In pandas, I can use dtype:
    df = pd.read_csv("iris.csv", dtype={'petal_length': str})
I'm trying to do the same thing in polars, but so far without success. Here is what I have tried:
    use polars::prelude::*;
    use std::fs::File;
    use std::collections::HashMap;

    fn main() {
        let df = example();
        println!("{:?}", df.expect("Cannot find dataframe").head(Some(10)))
    }

    fn example() -> Result<DataFrame> {
        let file = File::open("iris.csv")
            .expect("could not read file");
        let mut myschema = HashMap::new();
        myschema.insert("sepal_length", f64);
        myschema.insert("sepal_width", f64);
        myschema.insert("petal_length", String);
        myschema.insert("petal_width", f64);
        myschema.insert("species", String);
        CsvReader::new(file)
            .with_schema(myschema)
            .has_header(true)
            .finish()
    }
My question is: what type of data does with_schema expect? I printed the schema of the DataFrame loaded using infer_schema(None). This prints an object that looks like a dictionary:
Schema { fields: [Field { name: "sepal_length", data_type: Float64 }, Field { name: "sepal_width", data_type: Float64 }, Field { name: "petal_length", data_type: Float64 }, Field { name: "petal_width", data_type: Float64 }, Field { name: "species", data_type: Utf8 }] }
But I cannot figure out what object I should use to define my schema.
Also, is there a way to specify the type of one column, instead of all of them?
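To make the shape of the problem concrete, here is a minimal sketch of what the printed schema suggests: a list of fields, each pairing a column name with a data type value such as Float64 or Utf8, rather than a HashMap holding Rust types like f64 or String (types are not values, so they cannot be inserted into a map). The DataType, Field and Schema definitions below are stand-ins written for this illustration, not the polars types, and with_override is a hypothetical helper. polars exports its own Schema, Field and DataType, whose constructors may differ between versions.

```rust
// Stand-in types mirroring the printed Debug output in the question.
// They are NOT the polars types; they only illustrate the structure.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Float64,
    Utf8,
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
}

#[derive(Debug, Clone, PartialEq)]
struct Schema {
    fields: Vec<Field>,
}

impl Schema {
    // Hypothetical helper: change the dtype of one column and
    // leave every other field as it was inferred.
    fn with_override(mut self, name: &str, data_type: DataType) -> Schema {
        for field in self.fields.iter_mut() {
            if field.name == name {
                field.data_type = data_type.clone();
            }
        }
        self
    }
}

// Small convenience constructor for a stand-in Field.
fn field(name: &str, data_type: DataType) -> Field {
    Field { name: name.to_string(), data_type }
}

// The schema as it was inferred for iris.csv in the question.
fn inferred_iris_schema() -> Schema {
    Schema {
        fields: vec![
            field("sepal_length", DataType::Float64),
            field("sepal_width", DataType::Float64),
            field("petal_length", DataType::Float64),
            field("petal_width", DataType::Float64),
            field("species", DataType::Utf8),
        ],
    }
}

fn main() {
    // Override only petal_length, the column the pandas example reads as str.
    let schema = inferred_iris_schema().with_override("petal_length", DataType::Utf8);
    println!("{:?}", schema);
}
```

The point of the sketch is that data types are passed as enum values (DataType::Utf8), and that overriding a single column amounts to starting from the inferred fields and replacing one entry.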
Argument of type Vec<polars::prelude::Field> unexpected. Any idea? – Waterspout