Pandas - save multiple CSV in a zip archive
Asked Answered
G

2

5

I need to save multiple dataframes in CSV, all in a same zip file. Is it possible without making temporary files?

I tried using zipfile:

with zipfile.ZipFile("archive.zip", "w") as zf:
    with zf.open(f"file1.csv", "w") as buffer:
        data_frame.to_csv(buffer, mode="wb")

This works with to_excel but fails with to_csv as as zipfiles expects binary data and to_csv writes a string, despite the mode="wb" parameter:

.../lib/python3.8/site-packages/pandas/io/formats/csvs.py", line 283, in _save_header

writer.writerow(encoded_labels)

.../lib/python3.8/zipfile.py", line 1137, in write

TypeError: a bytes-like object is required, not 'str'

On the other hand, I tried using the compression parameter of to_csv, but the archive is overwritten, and only the last dataframe remains in the final archive.

If no other way, I'll use temporary files, but I was wondering if someone have an idea to allow to_csv and zipfile work together.

Thanks in advance!

Goulden answered 19/4, 2021 at 9:34 Comment(0)
M
4

I would approach this following way

import io
import pandas as pd
df = pd.DataFrame({"x":[1,2,3]})
string_io = io.StringIO()
df.to_csv(string_io)
string_io.seek(0)
df_bytes = string_io.read().encode('utf-8')

as df_bytes is bytes it should now work with zipfile. Edit: after looking into to_csv help I found simpler way, to get bytes namely:

import pandas as pd
df = pd.DataFrame({"x":[1,2,3]})
df_bytes = df.to_csv().encode('utf-8')
Mayers answered 19/4, 2021 at 9:42 Comment(1)
@EinoGourdin I found simpler way after consulting help of to_csv and edited my answer accordinglyMayers
T
3

For saving multiple excel files from dataframe in a zip file

import zipfile
from zipfile import ZipFile
import pandas as pd

df1 = pd.DataFrame({"x":[1,2,3]})
df2 = pd.DataFrame({"y":[4,5,6]})
df3 = pd.DataFrame({"z":[7,8,9]})

with zipfile.ZipFile("rishabh.zip", "w") as zf:
      with zf.open(f"check1.xlsx", "w") as buffer:
         df1.to_excel(buffer,index=False)
      with zf.open(f"check2.xlsx", "w") as buffer:
         df2.to_excel(buffer,index=False)
      with zf.open(f"check3.xlsx", "w") as buffer:
         df3.to_excel(buffer, index=False)
Tilford answered 10/10, 2021 at 12:25 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.