How to open excel file in Polars dataframe?
C

4

5

I am a Python pandas user, but I recently found out about Polars DataFrames, which seem promising and very fast. I cannot find a way to open an Excel file in Polars. Polars happily reads CSV, JSON, etc., but not Excel.

I use Excel files extensively in pandas and want to try Polars. My workbooks have many sheets, which pandas reads automatically. How can I do the same with Polars?

What am I missing?

Cosetta answered 28/3, 2022 at 4:23 Comment(3)
Have a look at the library's API and see if it offers a way to read Excel files. – Immanuel
Sorry, but it looks like Polars does not have any read/write functions for the Excel format. If you can, I would recommend saving your data in CSV format or in SQL instead. – Seduce
Yes, it seems there is currently no option in Polars to read or write Excel. I receive my data daily as Excel files with multiple sheets, and converting them to CSV would cost extra time. I will continue with pandas until there is a solution. – Cosetta
S
3

This is more of a workaround than a real answer, but you can read it into pandas and then convert it to a polars dataframe.

import pandas as pd
import polars as pl
# Read with pandas, then convert the result to a Polars DataFrame
df = pd.read_excel(...)
df_pl = pl.from_pandas(df)

You could, however, make a feature request to the Apache Arrow community to support Excel files.

Shy answered 28/3, 2022 at 7:6 Comment(0)
U
6

Polars now has a read_excel function, as of this PR!

https://pola-rs.github.io/polars/py-polars/html/reference/api/polars.read_excel.html

You should be able to just do:

import polars as pl
df = pl.read_excel("file.xlsx")
Udale answered 15/8, 2022 at 1:15 Comment(0)
R
0

I had trouble opening an Excel file with pl.read_excel; this configuration worked for me:

import polars as pl
# Scan more rows before settling on each column's dtype
df_pl = pl.read_excel("YOUR_FILE_DIR", read_csv_options={"infer_schema_length": 10000})
df_pl.head()

Note: this option applies only when using the xlsx2csv engine.


Rai answered 3/2, 2024 at 15:48 Comment(0)
B
0

In Polars 1.0+ the default engine is "calamine"; earlier it was "xlsx2csv".

The documentation says:

Where possible, prefer the default “calamine” engine for reading Excel Workbooks, as it is significantly faster than the other options.

“calamine”: this engine can be used for reading all major types of Excel Workbook (.xlsx, .xlsb, .xls) and is dramatically faster than the other options, using the fastexcel module to bind the Calamine parser.

You will need fastexcel to use calamine.

pip install fastexcel

then:

import polars as pl
df = pl.read_excel("file.xlsx")

https://docs.pola.rs/api/python/stable/reference/api/polars.read_excel.html

Bilek answered 6/8, 2024 at 13:56 Comment(0)
