python xlsxwriter extract value from cell
Asked Answered
W

2

6

Is it possible to extract data that I've written to a xlsxwriter.worksheet?

import xlsxwriter

output = "test.xlsx"
workbook = xlsxwriter.Workbook(output)
worksheet = workbook.add_worksheet()


worksheet.write(0, 0, 'top left')
if conditional:
    worksheet.write(1, 1, 'bottom right')

for row in range(2):
  for col in range(2):
    # Now how can I check if a value was written at this coordinate?
    # something like worksheet.get_value_at_row_col(row, col)

workbook.close()
Wad answered 12/7, 2020 at 18:46 Comment(0)
B
3

Is it possible to extract data that I've written to a xlsxwriter.worksheet?

No. XlsxWriter is write only. If you need to keep track of your data you will need to do it in your own code, outside of XlsxWriter.

Borg answered 12/7, 2020 at 18:59 Comment(1)
While I agree that keeping track of your data in your code is easiest, I think the replies with the creator of xlsxwriter in @Basement answer to be worthwhile alongside this.Ida
B
10

Is it possible to extract data that I've written to a xlsxwriter.worksheet?

Yes. Even though XlsxWriter is write only, it stores the table values in an internal structure and only writes them to file when workbook.close() is executed.

Every Worksheet has a table attribute. It is a dictionary, containing entries for all populated rows (row numbers starting at 0 are the keys). These entries are again dictionaries, containing entries for all populated cells within the row (column numbers starting at 0 are the keys).

Therefore, table[row][col] will give you the entry at the desired position (but only in case there is an entry, it will fail otherwise).

Note that these entries are still not the text, number or formula you are looking for, but named tuples, which also contain the cell format. You can type check the entries and extract the contents depending on their nature. Here are the possible outcomes of type(entry) and the fields of the named tuples that are accessible:

  • xlsxwriter.worksheet.cell_string_tuple: string, format
  • xlsxwriter.worksheet.cell_number_tuple: number, format
  • xlsxwriter.worksheet.cell_blank_tuple: format
  • xlsxwriter.worksheet.cell_boolean_tuple: boolean, format
  • xlsxwriter.worksheet.cell_formula_tuple: formula, format, value
  • xlsxwriter.worksheet.cell_arformula_tuple: formula, format, value, range

For numbers, booleans, and formulae, the contents can be accessed by reading the respective field of the named tuple.

For array formulae, the contents are only present in the upper left cell of the output range, while the rest of the cells are represented by number entries with 0 value.

For strings, the situation is more complicated, since Excel's storage concept has a shared string table, while the individual cell entries only point to an index of this table. The shared string table can be accessed as the str_table.string_table attribute of the worksheet. It is a dictionary, where the keys are strings and the values are the associated indices. In order to access the strings by index, you can generate a sorted list from the dictionary as follows:

shared_strings = sorted(worksheet.str_table.string_table, key=worksheet.str_table.string_table.get)

I expanded your example from above to include all the explained features. It now looks like this:

import xlsxwriter

output = "test.xlsx"
workbook = xlsxwriter.Workbook(output)
worksheet = workbook.add_worksheet()

worksheet.write(0, 0, 'top left')
worksheet.write(0, 1, 42)
worksheet.write(0, 2, None)
worksheet.write(2, 1, True)
worksheet.write(2, 2, '=SUM(X5:Y7)')
worksheet.write_array_formula(2,3,3,4, '{=TREND(X5:X7,Y5:Y7)}')
worksheet.write(4,0, 'more text')
worksheet.write(4,1, 'even more text')
worksheet.write(4,2, 'more text')
worksheet.write(4,3, 'more text')

for row in range(5):
  row_dict = worksheet.table.get(row, None)
  for col in range(5):
    if row_dict != None:
      col_entry = row_dict.get(col, None)
    else:
      col_entry = None
    print(row,col,col_entry)

shared_strings = sorted(worksheet.str_table.string_table, key=worksheet.str_table.string_table.get)

print()
if type(worksheet.table[0][0]) == xlsxwriter.worksheet.cell_string_tuple:
  print(shared_strings[worksheet.table[0][0].string])

# type checking omitted for the rest...
print(worksheet.table[0][1].number)
print(bool(worksheet.table[2][1].boolean))
print('='+worksheet.table[2][2].formula)
print('{='+worksheet.table[2][3].formula+'}')

workbook.close()
Basement answered 4/8, 2020 at 16:37 Comment(3)
I am the author of XlsxWriter. This is a nice bit of reverse engineering or proof of concept but I would advise against doing this in practice. There is no guarantee, either explicit or implicit, that data is, or will be, maintained once it crosses the XlsxWriter API. The best way to handle use cases like this is to track the data outside the library. Python has a rich toolkit to handle data that is better than relying on internals of a library that wasn't written for the purpose of tracking data.Borg
Let me say thank you for such a handy python module, XlsxWriter is really helping me a lot! I absolutely agree, that in general keeping track of your data is way easier outside XlsxWriter. I just stumbled upon these internals while looking into an autofit solution for column widths, which does not require constant attention while filling the table... (In this answer, to be exact: stackoverflow.com/a/53091320).Basement
A safer, although probably more verbose, way of dealing with this problem would be to inherit from and override the default worksheet class and add in some autofit code, like this example: xlsxwriter.readthedocs.io/example_inheritance2.htmlBorg
B
3

Is it possible to extract data that I've written to a xlsxwriter.worksheet?

No. XlsxWriter is write only. If you need to keep track of your data you will need to do it in your own code, outside of XlsxWriter.

Borg answered 12/7, 2020 at 18:59 Comment(1)
While I agree that keeping track of your data in your code is easiest, I think the replies with the creator of xlsxwriter in @Basement answer to be worthwhile alongside this.Ida

© 2022 - 2024 — McMap. All rights reserved.