Read cell format from Google sheet using Python (API v4)

I'm looking for a way to read a cell's formatting from a Google sheet in Python, specifically, its background colour.

Two popular packages I've found to be able to read a sheet are gspread (fork) and pygsheets. I tried both and they work well in reading my sheet data, however from what I can see neither supports reading cell formats, only setting them. This open issue on pygsheets' GitHub page describes the kind of functionality I need.

Essentially, where each row is a record that has timestamp, username, comments etc., I want to find all the rows by a particular username and only those that don't have a red background, sort of like this:

if ("Username" in record.values()) and (matching_cells.background != red):
        # Do something

Thanks!

Using google-api-python-client to obtain your authorized sheets API client, you can request the spreadsheet data with the service.spreadsheets().get method:

def get_sheet_colors(service, wbId: str, ranges: list):
    params = {'spreadsheetId': wbId,
              'ranges': ranges,
              'fields': 'sheets(data(rowData(values(effectiveFormat/backgroundColor,formattedValue)),startColumn,startRow),properties(sheetId,title))'}
    return service.spreadsheets().get(**params).execute()

desiredA1NotationRanges = ['\'Some Arbitrary Sheet 1\'!A1:K', '\'Some Other Arbitary Sheet\'!B2:D4']
all_data = get_sheet_colors(get_authed_service_somehow(), mySpreadsheetId, desiredA1NotationRanges))
# all_data is a dict with keys determined by the fields in the request
# (i.e. "sheets") and the output of the API method used (aka consult your API reference)

The code below uses the API response from above and makes two arrays, one with background colors and one with the cell values, and ensures that row & column indices are portable by prefixing rows and columns to ensure the data starts from cell A1, even if you requested a range like "C3:J5". This is provided as an example for converting the REST resource into more familiar types, and is not intended to be useful in a generic sense.

dataset = []
default_bg = {'red': 1, 'green': 1, 'blue': 1}
# all_data['sheets'] is a list of sheet resources (per the API spec.)
for sheet in all_data['sheets']:
    # The sheet resource is a dict with keys determined by what we requested in fields
    # (i.e. properties (->sheetId, ->title), data)
    print('Sheet name is {title} with grid id {sheetId}'.format_map(sheet["properties"]))
    # each range in data will only contain startRow and/or startColumn if they are not 0
    # (i.e. if you grab A1:___, you won't have startRow or startColumn)
    for range in sheet['data']:
        rowData = range.get('rowData', [])
        if not rowData:
            continue
        offsets = {'row': range.get('startRow', 0),
                   'col': range.get('startColumn', 0)}
        rangeBGs = [default_bg] * offsets['row']
        rangeValues = [''] * offsets['row']
        for row in rowData:
            colData = row['values']
            newBGs = [default_bg] * offsets['col']
            newVals = [''] * offsets['col']
            for col in colData:
                try:
                    newBGs.append(col['effectiveFormat']['backgroundColor'])
                except KeyError:
                    newBGs.append(default_bg) # Shouldn't get called (all cells have a background)
                try:
                    newVals.append(col['formattedValue']) # Always a string if present.
                except KeyError:
                    newVals.append('') # Not all cells have a value.
            rangeBGs.append(newBGs)
            rangeValues.append(newVals)
        dataset.append({'sheetId': sheet['properties']['sheetId'],
                        'sheetName': sheet['properties']['title'],
                        'backgrounds': rangeBGs,
                        'values': rangeValues})
# dataset is now a list with elements that correspond to the requested ranges,
# and contain 0-base row and column indexed arrays of the backgrounds and values.
# One could add logic to pop elements from the ranges if the entire row has no values.
# Color in A1 of 1st range:
r1 = dataset[0]
print(f'Cell A1 color is {r1["backgrounds"][0][0]} and has value {r1["values"][0][0]}')
print(f'Cell D2 color is {r1["backgrounds"][3][1]} and has value {r1["values"][3][1]}')

References:

Python Sheets API documentation
Google API Python Client
Google's Python API Explorer
The spreadsheet JSON resource
"Fields" Reference Use this syntax with the API description of the response resource / entity. (Google also has an interactive fields builder somewhere in its API Explorer documentation.)

Recommended topics

Hot tags