Does Polars module not have a method for appending DataFrames to output files?

When writing a DataFrame to a CSV file, I would like to append to the file instead of overwriting it.

While the pandas DataFrame has a .to_csv() method with a mode parameter, which allows appending the DataFrame to a file, none of the Polars DataFrame write methods seem to have that parameter.
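
For reference, the pandas behavior described above might look like the following. This is a minimal sketch: the temporary file location and the sample data are illustrative assumptions, not part of the original question.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Illustrative output path in a fresh temporary directory (an assumption).
pandas_out = Path(tempfile.mkdtemp()) / "out.csv"

df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = pd.DataFrame({"a": [5, 6], "b": [7, 8]})

# First write creates the file and includes the header row.
df1.to_csv(pandas_out, index=False)
# mode="a" appends instead of overwriting; header=False avoids a second header.
df2.to_csv(pandas_out, mode="a", header=False, index=False)

print(pandas_out.read_text(), end="")
```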

Offering answered 29/1, 2023 at 6:4 Comment(10)
Similar post regarding Parquet files here. From the documentation, it looks like Polars does not support appending to CSV files either. – Cascabel
@adamius Yes, I saw the post about Parquet files. I wanted to be sure that Polars does not support appending to files as a DataFrame method. Thanks. – Offering
You can pass a file handle/Path object, e.g. with open("out.csv", mode="ab") as f: df.write_csv(f, has_header=False) – Abijah
@Abijah That didn't work, because df.write_csv() expects a string and not a bytes object. – Offering
It works for me. Did you use mode="ab" exactly? The b is also required. – Abijah
Please provide enough code so others can better understand or reproduce the problem. – Ringworm
@Abijah That indeed worked for me! It's a nice workaround. Now I'm left wondering why Polars does not provide that capability. – Offering
@Mr.Caribbean I would guess that it's some combination of (in no particular order) no one asking for it, it being trivial to get the functionality with base Python, and there being higher-priority functionality to work on. – Statolatry
@DeanMacGregor Yes. Maybe the idea is to keep it practical and focus on lazy evaluation, parallelism, memory management and speed. – Offering
Does this work with any file format, or just CSV? – Cusk

To append to a CSV file, for example, you can pass a file object opened in append mode to write_csv:

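import polars as pl

df1 = pl.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = pl.DataFrame({"a": [5, 6], "b": [7, 8]})

# Assumes out.csv does not exist yet; otherwise the header is appended again.
with open("out.csv", mode="a") as f:
    df1.write_csv(f)                        # first write includes the header
    df2.write_csv(f, include_header=False)  # later writes skip the header

>>> from pathlib import Path
>>> print(Path("out.csv").read_text(), end="")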
a,b
1,3
2,4
5,7
6,8
Abijah answered 29/1, 2023 at 6:48 Comment(1)
I have three suggestions to improve your answer: (1) You should make sure that out.csv does not exist before running the example, (2) since CSV files are text files I'd use mode="a", and (3) has_header has been renamed to include_header. – Jiggermast
