I want to add and extract files from an Office/Excel document using Python. So far adding things is easy but for extracting I haven't found a clean solution.
To make clear what I've got and what not I've written the small example test.py below and explain further.
test.py
import win32com.client as win32
import os
from tkinter import messagebox
import win32clipboard
# (0) Setup
dir_path = os.path.dirname(os.path.realpath(__file__))
print(dir_path)
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Open(dir_path + "\\" + "test_excel.xlsx")
ws = wb.Worksheets.Item(1)
objs = ws.OLEObjects()
# (1) Embed file
f = dir_path + "\\" + "test_txt.txt"
name = "test_txt_ole.txt"
objs.Add( Filename=f, IconLabel=name )
# (2) Access embedded file
obj = objs.Item(1) # Get single OLE from OLE list
obj.Copy()
win32clipboard.OpenClipboard()
data = win32clipboard.GetClipboardData(0xC004) # Binary access
win32clipboard.EmptyClipboard()
win32clipboard.CloseClipboard()
messagebox.showinfo(title="test_txt_ole.txt", message=str(data))
# (3) Press don't save here to keep
# wb.Close() # Will close excel document and leave excel opened.
excel.Application.Quit() # Will close excel with all opened documents
For preparation (step 0) it opens a given excel document with one worksheet that was create before by using new document button in excel.
In step (1) it uses API to embed a given text file to the excel document. The text file was created before with content "TEST123" using a text editor.
Afterwards in step (2) it tries to read back content from embedded OLE using clipboard and opens a message box that shows the content from OLE in clipboard.
Finally (3) the program closes the opened document. To keep an unchanged setup press no here.
The big disadvantage of this solution is the use of clipboard which smashes any user content in clipboard which is bad style in a productive environment. Further it uses an undocumented option for clipboard.
A better solution would be to safe OLE or OLE embedded file to a python data container or to a file of my choice. In my example I've used a TXT file to easily identify file data. Finally I'll use ZIP for an all-in-one solution but a TXT file solution would be sufficient for base64 data.
Source of 0xC004 = 49156: https://danny.fyi/embedding-and-accessing-a-file-in-excel-with-vba-and-ole-objects-4d4e7863cfff
This VBA example look interesting but I have no clue about VBA: Saving embedded OLE Object (Excel workbook) to file in Excel 2010
Workbooks.Open
,Worksheets.Item
, etc. are kindof VBA commands (technically they are IDispatch calls). Why don't you just try to call something likeoEmbFile.Object.SaveAs fName
in the VBA example directly from your python code – Jackson