Appending pandas Data Frame to Google spreadsheet
M

10

19

Case: My script returns a data frame that has to be appended to an existing Google spreadsheet as new rows of data. As of now, I'm appending the data frame as multiple single rows through gspread.

My Code:

import gspread
import pandas as pd

# gc is an authorized gspread client, e.g. gc = gspread.service_account()
df = pd.DataFrame()

# After some processing a non-empty data frame has been created.

# 'SheetName' is the Google spreadsheet and 'xyz' is a worksheet in it
output_conn = gc.open("SheetName").worksheet("xyz")

# One API call per row
for i, row in df.iterrows():
    output_conn.append_row(row.tolist())

Is there a way to append the entire data frame at once rather than as multiple single rows?

Monde answered 7/8, 2017 at 6:45 Comment(0)
W
22

I can recommend gspread-dataframe:

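import pandas as pd
import gspread_dataframe as gd

# Connecting with `gspread` here

ws = gc.open("SheetName").worksheet("xyz")
existing = gd.get_as_dataframe(ws)
# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
updated = pd.concat([existing, your_new_data], ignore_index=True)
gd.set_with_dataframe(ws, updated)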
Windham answered 7/8, 2017 at 11:45 Comment(4)
Doesn't this overwrite the "existing" data as well, making it wasteful if we are adding, for example, 5 new rows to a 1000-row table? - Foreshadow
Yes, this approach loads the existing Google Sheet data to your local machine, appends to it locally, and finally writes the whole shebang back to your Google Sheet. It looks like you COULD append by using the row parameter of set_with_dataframe, but then you had better trust your column order :-) - Windham
I have found a better solution: using values_append from the gspread package. This appends values in one call without the need to replace the current data in the sheet. - Foreshadow
@DarkTemplar that sounds cool! If you can take the time, please add a standalone answer so it can bubble up into common knowledge :-) - Windham
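The remark about trusting the column order can be made concrete. The following is a minimal sketch, assuming the sheet's header row has already been read as a list of strings (for example with ws.row_values(1)) and the new data is available as a list of dicts (for example from df.to_dict("records")). The helper align_to_header is a hypothetical name for illustration, not part of gspread or gspread-dataframe.

```python
def align_to_header(header, records, fill=""):
    """Order each record's values to match the sheet's header row.

    header  -- column names as they appear in row 1 of the sheet
    records -- list of dicts, e.g. from df.to_dict("records")
    fill    -- value used when a record lacks a column
    """
    unknown = {key for rec in records for key in rec} - set(header)
    if unknown:
        # Refuse to silently drop data that has no column in the sheet.
        raise ValueError(f"columns not in sheet header: {sorted(unknown)}")
    return [[rec.get(col, fill) for col in header] for rec in records]


header = ["a", "b", "c"]
records = [{"b": 2, "a": 1}, {"c": 9, "a": 3}]
rows = align_to_header(header, records)
print(rows)  # [[1, 2, ''], [3, '', 9]]
```

The resulting list of rows can then be appended in whatever way you prefer, and the values land under the intended headers regardless of the column order in the data frame.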
I
7

Here is the code to write to, append to (without loading the existing sheet into memory), and read from Google Sheets.

import pandas as pd
import gspread as gs
import gspread_dataframe as gd

gc = gs.service_account(filename="your/cred/file.json")

def export_to_sheets(worksheet_name, df, mode='r'):
    ws = gc.open("SHEET_NAME").worksheet(worksheet_name)
    if mode == 'w':
        ws.clear()
        gd.set_with_dataframe(worksheet=ws, dataframe=df, include_index=False, include_column_header=True, resize=True)
        return True
    elif mode == 'a':
        # Remember the first new row before growing the grid, since
        # row_count may change once add_rows() has been called.
        first_new_row = ws.row_count + 1
        ws.add_rows(df.shape[0])
        gd.set_with_dataframe(worksheet=ws, dataframe=df, include_index=False, include_column_header=False, row=first_new_row, resize=False)
        return True
    else:
        return gd.get_as_dataframe(worksheet=ws)

df = pd.DataFrame.from_records([{'a': i, 'b': i * 2} for i in range(100)])
export_to_sheets("WORKSHEET_NAME", df, 'a')

  1. Write Mode: First, clear the existing worksheet with ws.clear(). Second, upload the dataframe with set_with_dataframe(). Note that resize=True, which strictly sets the worksheet's row and column counts to df.shape. This helps later in append mode.
  2. Append Mode: First, add as many rows as the dataframe has. Second, call set_with_dataframe() with resize=False, since the rows have already been added, and with row set to ws.row_count+1 so the data is anchored just below the existing grid.
  3. Read Mode (default): Returns the worksheet as a dataframe.
Inexactitude answered 18/8, 2020 at 15:58 Comment(3)
Interesting share: the package github.com/betodealmeida/shillelagh provides a SQL interface to Google Sheets. We can INSERT, UPDATE, and SELECT rows directly in Google Sheets. - Inexactitude
Hey Darsh, I tried your code, but append mode always returns an error. Do you happen to know why? gspread.exceptions.APIError: {'code': 400, 'message': 'Range (test!A999:B1001) exceeds grid limits. Max rows: 998, max columns: 26', 'status': 'INVALID_ARGUMENT'} - Ideograph
Did some testing and noticed that row_count counts empty rows as well, which causes the exception. I would suggest tweaking the code to use max_rows = len(sheet.get_all_values(major_dimension='rows')) instead of row=ws.row_count. This way the data is appended after the last non-blank row. - Ideograph
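The suggestion in the last comment can be sketched as a small helper. This is only an illustration, assuming the sheet contents have been fetched as a list of rows of strings (the shape that get_all_values() returns); next_free_row is a hypothetical name, not a gspread function.

```python
def next_free_row(values):
    """Return the 1-based row number just below the last non-blank row.

    values -- list of rows (lists of cell strings), as returned by
              ws.get_all_values(); blank cells are empty strings.
    """
    last_used = 0
    for index, row in enumerate(values, start=1):
        if any(str(cell).strip() for cell in row):
            last_used = index
    return last_used + 1


values = [["a", "b"], ["1", "2"], ["", ""], ["3", "4"], ["", ""]]
print(next_free_row(values))  # 5
```

Passing this number as the row argument anchors the write after the data rather than after the grid, so trailing blank rows get filled instead of triggering the grid-limit error. Reading all values costs one extra API call, which is the trade-off against relying on row_count.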
P
4

I was facing the same problem. Here's what I did: I converted the dataframe into a list of rows and used gspread's append_rows().

    gc = gspread.service_account(filename="credentials.json")
    sh = gc.open_by_key('<your_key>')
    ws = sh.sheet1
    
    ##data is the original data frame
    data_list = data.values.tolist()
    
    ws.append_rows(data_list)
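One caveat with values.tolist(): missing values come through as NaN, which is not valid JSON, and objects such as dates are not JSON-serializable, so the request can fail on frames that contain them. Below is a minimal sketch of a cleaning step, assuming the rows come from data.values.tolist(); to_sheet_rows is a hypothetical helper name, not part of gspread.

```python
import datetime
import math


def to_sheet_rows(rows, blank=""):
    """Make a list of rows safe to send as JSON.

    rows  -- list of lists, e.g. data.values.tolist()
    blank -- replacement for missing values (None or NaN)
    """
    def clean(cell):
        if cell is None:
            return blank
        if isinstance(cell, float) and math.isnan(cell):
            return blank
        if isinstance(cell, (str, bool, int, float)):
            return cell
        # Anything else (timestamps, decimals, ...) is sent as text.
        return str(cell)

    return [[clean(cell) for cell in row] for row in rows]


rows = [[1, float("nan"), "x"], [2.5, None, datetime.date(2021, 4, 24)]]
print(to_sheet_rows(rows))
# [[1, '', 'x'], [2.5, '', '2021-04-24']]
```

With this in place the call would become ws.append_rows(to_sheet_rows(data.values.tolist())).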
Patricepatrich answered 24/4, 2021 at 15:55 Comment(0)
B
3

The following approach, using gspread, may help one understand the procedures and solve the problem

  1. Install the libraries in your environment.

  2. Import the libraries in the script

    import pandas as pd
    import gspread
    from gspread_dataframe import set_with_dataframe
    
  3. Create credentials in Google API console.

  4. Add the following to the script, to access the Google Sheet

    gc = gspread.service_account(filename='GoogleAPICredentials.json')
    sh = gc.open_by_key('GoogleSheetID')
    

  5. Assuming one wants to add to the first sheet, use 0 in get_worksheet (for the second sheet use 1, and so on)

    worksheet = sh.get_worksheet(0)

  6. Then, to export the dataframe (assuming it is named df) to the Google Sheet

    set_with_dataframe(worksheet, df)
    
Brainstorming answered 28/5, 2021 at 11:8 Comment(0)
F
2

I came up with the following solution. It does not overwrite the current data; it appends the entire pandas DataFrame df to the end of the sheet named sheet_name in the spreadsheet named spread_sheet.

import gspread
from oauth2client.service_account import ServiceAccountCredentials

def append_df_to_gs(df, spread_sheet:str, sheet_name:str):
    scopes = [
        'https://spreadsheets.google.com/feeds',
        'https://www.googleapis.com/auth/drive',
    ]
    # path_to_credentials: path to the service-account JSON key file (defined elsewhere)
    credentials = ServiceAccountCredentials.from_json_keyfile_name(
        path_to_credentials,
        scopes=scopes
    )
    gsc = gspread.authorize(credentials)
    sheet = gsc.open(spread_sheet)
    params = {'valueInputOption': 'USER_ENTERED'}
    body = {'values': df.values.tolist()}
    # ':str' is not a valid format spec inside an f-string, so use the name directly
    sheet.values_append(f'{sheet_name}!A1:G1', params, body)

For the valueInputOption parameter, please consult the Google Sheets API documentation. I used USER_ENTERED here because I needed some formulas to be valid once the data was appended to Google Sheets.
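The range in the call above is hard-coded to seven columns (A1:G1). As a rough sketch, the range and parameters could instead be derived from the data. The helpers column_letter and build_append_request are hypothetical names for illustration; valueInputOption and insertDataOption are query parameters of the Sheets API values.append method, and the sketch assumes the sheet name contains no single quotes.

```python
def column_letter(index):
    """Convert a 1-based column index to A1 letters (1 -> A, 27 -> AA)."""
    if index < 1:
        raise ValueError("column index must be >= 1")
    letters = ""
    while index:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def build_append_request(sheet_name, rows, formulas=True):
    """Return (range, params, body) for Spreadsheet.values_append()."""
    # Width of the widest row, at least one column.
    width = max([len(row) for row in rows] + [1])
    cell_range = f"'{sheet_name}'!A1:{column_letter(width)}1"
    params = {
        # USER_ENTERED parses input as if typed in the UI (formulas, dates);
        # RAW stores the values exactly as sent.
        "valueInputOption": "USER_ENTERED" if formulas else "RAW",
        # INSERT_ROWS inserts new rows for the data instead of
        # writing over existing cells.
        "insertDataOption": "INSERT_ROWS",
    }
    body = {"values": rows}
    return cell_range, params, body


cell_range, params, body = build_append_request("xyz", [[1, 2, 3], [4, 5, 6]])
print(cell_range)  # 'xyz'!A1:C1
```

The three values map directly onto the call in the answer: sheet.values_append(cell_range, params, body).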

Foreshadow answered 3/2, 2020 at 14:29 Comment(0)
M
1
ws = gc.open("sheet title").worksheet("Sheet1")

gd.set_with_dataframe(ws, dataframe)

#simply transform your dataframe to google sheet

Mountfort answered 15/9, 2020 at 10:16 Comment(0)
C
1

I came up with the following solution using a try/except statement: if the worksheet doesn't exist, it creates it for you and writes the dataframe; otherwise it appends the dataframe.

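import gspread
from gspread_dataframe import set_with_dataframe

def load_to_sheet(conn_sheet, spreadsheet_name, df):
    try:
        worksheet = conn_sheet.worksheet(spreadsheet_name)
    except gspread.exceptions.WorksheetNotFound:
        # Worksheet is missing: create it and write the frame with its header
        worksheet = conn_sheet.add_worksheet(title=spreadsheet_name, rows=100, cols=100)
        set_with_dataframe(worksheet=worksheet, dataframe=df, include_index=False, include_column_header=True,
                           resize=True)
    else:
        # Worksheet exists: note the first new row, then grow the grid and append
        first_new_row = worksheet.row_count + 1
        worksheet.add_rows(df.shape[0])
        set_with_dataframe(worksheet=worksheet, row=first_new_row, dataframe=df, include_index=False,
                           include_column_header=False,
                           resize=False)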
Curitiba answered 7/10, 2022 at 9:10 Comment(0)
B
0

The following doesn't require external libs other than gspread:

worksheet.update([dataframe.columns.values.tolist()] + dataframe.values.tolist())
Backset answered 26/5, 2022 at 20:27 Comment(0)
O
0

Improving upon the solution given by Darsh Shukla

# gc (gspread client) and gd (gspread_dataframe) are set up as in Darsh Shukla's answer
def append_df_to_sheet(g_spread_name, sheet_name, df):
    wks = gc.open(g_spread_name).worksheet(sheet_name)  # get the sheet
    append_row = wks.row_count + 1  # position where to append
    wks.add_rows(df.shape[0])  # add rows to avoid API range error
    gd.set_with_dataframe(worksheet=wks,
                dataframe=df,
                include_index=False,
                include_column_header=False,
                row=append_row,
                resize=False)
    return True

This avoids the error

gspread.exceptions.APIError: {'code': 400, 'message': 'Range (test!A999:B1001) exceeds grid limits. Max rows: 998, max columns: 26', 'status': 'INVALID_ARGUMENT'}
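The arithmetic behind that error can be checked without touching the API. The sketch below is illustrative only; rows_to_add is a hypothetical helper that computes how many rows the grid is short by for a given write.

```python
def rows_to_add(row_count, first_row, n_new):
    """How many rows must be added so n_new rows fit starting at first_row.

    row_count -- current number of rows in the grid (ws.row_count)
    first_row -- 1-based row where the first new record will be written
    n_new     -- number of records to write
    """
    last_row_needed = first_row + n_new - 1
    return max(0, last_row_needed - row_count)


# The failing case from the error message: a 998-row grid, writing
# three rows starting at row 999 (A999:B1001).
print(rows_to_add(998, 999, 3))  # 3
# A mostly empty 1000-row grid needs no extra rows.
print(rows_to_add(1000, 6, 5))   # 0
```

Calling add_rows() with this number before writing keeps the target range inside the grid without growing the sheet more than necessary.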
Overt answered 18/2, 2023 at 14:9 Comment(0)
C
-2

If Google Sheets accepts the .csv format, you can convert the pandas dataframe to CSV using df.to_csv() and save it in that format.

Callery answered 7/8, 2017 at 7:40 Comment(0)
