spreadsheet to python dictionary conversion
Asked Answered
G

7

12

I am working on python and I want to read an *.ods file and convert it to a python dictionary.

The key will be the first column value and the value will be second column value.

How can I do it? I used xlrd but it does not read *.ods files.

Grum answered 20/1, 2011 at 8:47 Comment(4)
pypi.python.org/pypi/ezodf or udk.openoffice.org/python/python-bridge.htmlExtremity
@marcog: This comment should be an answer.Dupery
@Space I was just dropping a possibility that I didn't really read up on much, so I don't feel it's a complete answer.Extremity
@marcog: ezodf has promise but started only recently (version 0.2.0 alpha) and doesn't handle .ods files yet apart from opening and saving files, certainly doesn't know about tables (sheets), rows and cells yet. Also it's Python 3.x only.Doctor
D
8

Some available options:

  • pyexcel-ods: "A wrapper library to read, manipulate and write data in ods format." Can be installed via: pip install pyexcel-ods. I personally recommend this package as I've used it and it is being actively maintained.

  • py-odftools: "... a collection of tools for analyzing, converting and creating files in the ISO standard OpenDocument format." This project hasn't been updated since late 2007. It looks abandoned.

  • ezodf: "A Python package to create/manipulate OpenDocumentFormat files." Installable via pip install ezodf. See caveat in the comments below about a serious issue with this package.

Distortion answered 7/4, 2015 at 17:14 Comment(2)
Beware: the last package (ezodf), contains a serious bug that causes it to silently return wrong cell values: github.com/T0ha/ezodf/issues/12. Thanks Mateusz for bringing it to my attention.Distortion
github.com/pyexcel/pyexcel-ods3 says " pyexcel-ods3 reads, manipulates and writes data in ods formats using ezodf"Ferromanganese
S
3

Although you could ask your users to File>Save As (as you probably know), this might not be useful in your situation.

It's probably easier to use the libre/openoffice service. It can be run completely headless on a server without needing X11 installed or running, and that will give you a clean native conversion.

libreoffice --without-x --convert-to csv  filename.ods

Check libreoffice --help (or openoffice --help) for details. This could also be wrapped in os.system(), subprocess.*(), etc. (Note: use -convert-to on Windows.) Also note: you cannot already be running any instances of Libre/Open/Star office, including the quickstarter.

Update: prior versions of LibreOffice used --headless instead of --without-x.

Shurlocke answered 8/4, 2015 at 22:18 Comment(4)
Note: --headless is required for automated operation.Shurlocke
(I'd sure love to know how to specify an individual sheet.)Shurlocke
If you don't even want the X11 libraries linked in, here's how to do a headless build: wiki.documentfoundation.org/Development/HeadlessBuildShurlocke
In Libreoffice 4.4 and later, --headless is now --without-x. (windows: -without-x)Shurlocke
O
2

Can you convert the .ODS to a csv first? Then parsing CSV using Python is pretty easy using the csv module.

Occlusive answered 20/1, 2011 at 8:58 Comment(1)
thanks but as i know there is only way for convert ods to cvs is open office service. But i dont want to use it. can i directly convert it into dictionaryGrum
R
2

Check py-odftools.

Religieuse answered 20/1, 2011 at 8:58 Comment(0)
H
1

This approach from the link below works awesomely for me reading/loading *.ods files into python dataframe. You can choose to load by sheet index or by sheet name.

Peeped my solution from this project: https://pypi.org/project/pandas-ods-reader/

You might first need to install these dependencies: ezodf,lxml and pandas before continuing.

pip install pandas_ods_reader

from pandas_ods_reader import read_ods

Then:

filepath = "path/to/your/file.ods"

Doing loading of sheets based on indices (index 1 based)

sheet_idx = 1
df = read_ods(filepath, sheet_idx)

Doing loading of sheets based on sheet names

sheet_name = "sales_year_1"

df = read_ods(filepath, sheet_name)

Done.

Heartbeat answered 14/9, 2019 at 11:52 Comment(0)
R
0

There's a great article on Linux Journal how to read ods in python. Ods file is a juz zip file containing xml file inside. You can than parse xml file to read all cells.

http://www.linuxjournal.com/article/9347?page=0,2

Raimund answered 20/1, 2011 at 9:16 Comment(1)
"how to read ods in python" gets only as far as summarising a subset of the possible interesting tags e.g. it mentions <table:table-cell> but gives no clue at all about its bizarre contents and doesn't tell you that you have to do very careful book-keeping to keep track of what (row, column) the cell belongs to (it's not recorded in the cell attributes!!). Parsing the XML file correctly is NOT something that one should blithely suggest in answer to such a question. Have YOU tried this yourself? Would you like me to send you a test file off-line?Doctor
B
0

Using Odio you can do:

import odio

with open('test.ods', 'rb') as f:
    sheet = odio.parse_spreadsheet(f)

table = sheet.tables[0]
print(table.name)

for row in table.rows:
    print(row)
Belie answered 2/6, 2022 at 19:14 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.