LibreOffice: recursively convert documents in folders and subfolders

I have many files stored in the .doc format, which I can only open with LibreOffice on my Mac.

The command:

 ./soffice --headless --convert-to docx --outdir /home/user ~/Downloads/*.doc

does exactly what it should: it converts all the *.doc files to *.docx files.

My problem is that I have folders and subfolders with these files.

Is there any way to search through all the folders from a starting directory and let "soffice" do its job in each of these folders, storing the new versions (*.docx) exactly where the originals (*.doc) were found?

Unfortunately, I am not well-versed enough in AppleScript or Terminal to make this work. Yet there are 8000 .doc files in hundreds and hundreds of folders that need to be updated to .docx.

Thanks for your help.

Shcherbakov answered 20/12, 2020 at 14:44

Is there any way to search through all the folders from a starting directory and to let "soffice" do its job in each of these folders, storing the new versions (.docx) exactly where the original (.doc) was found?

This can actually be done in Terminal using a find command:

find '/path/to/directory' -type f -iname '*.doc' -execdir /Applications/LibreOffice.app/Contents/MacOS/soffice --headless --convert-to docx '{}' \;

In the example command above, change '/path/to/directory' to the actual pathname of the target directory that contains the subdirectories containing the .doc files.

In brief, this find command locates every .doc file (matched case-insensitively, because of -iname) anywhere under '/path/to/directory'. For each match, -execdir runs soffice from the directory that contains the file, so the resulting .docx is written next to its original .doc. The original .doc files are left in place.
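Before converting thousands of files, it may help to check what find will actually match. The following is a minimal sketch of such a dry run; the function name list_doc_files is only illustrative and is not part of find or LibreOffice. It uses the same tests as the command above but prints the matches instead of running soffice.

```shell
# Dry run: print every file the conversion command would pick up.
# list_doc_files is a hypothetical helper name chosen for this example.
list_doc_files() {
  # $1: the starting directory, e.g. '/path/to/directory'
  # -iname '*.doc' matches .doc, .DOC, .Doc, ... but not .docx,
  # because the pattern has to match the whole file name.
  find "$1" -type f -iname '*.doc' -print
}

# Usage: list_doc_files '/path/to/directory'
```

Piping the output through wc -l gives a count to compare against the expected number of documents.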


Notes:

  • Always ensure you have proper backups before running commands in Terminal, and that the commands as typed will produce the wanted behavior!

  • This command assumes no .docx files already exist of the same name as the .doc files in the subdirectories, as it automatically overwrites any existing .docx files of the same name as the .doc files. As far as I can tell, the soffice command does not provide an option not to overwrite existing files. If this is going to be an issue, then a different solution will be necessary.
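If overwriting is a concern, one possible workaround is to check for an existing .docx before calling soffice at all. The sketch below is an assumption-laden variant, not a feature of soffice: the function name convert_missing_docx and the SOFFICE variable are made up for this example, and it assumes soffice names its output by replacing the source extension with .docx. It uses the --outdir option from the question to place each result beside its source.

```shell
# Convert only those .doc files that have no .docx sibling yet.
# convert_missing_docx and SOFFICE are hypothetical names for this sketch.
convert_missing_docx() {
  # $1: the starting directory, e.g. '/path/to/directory'
  # SOFFICE may be set beforehand to point at a different soffice binary.
  soffice_bin=${SOFFICE:-/Applications/LibreOffice.app/Contents/MacOS/soffice}
  find "$1" -type f -iname '*.doc' -exec sh -c '
    soffice_bin=$1
    shift
    for doc in "$@"; do
      dir=$(dirname "$doc")
      docx="${doc%.*}.docx"   # assumed output name: same base name, .docx
      if [ -e "$docx" ]; then
        printf "skipping, already exists: %s\n" "$docx"
      else
        "$soffice_bin" --headless --convert-to docx --outdir "$dir" "$doc"
      fi
    done
  ' sh "$soffice_bin" {} +
}

# Usage: convert_missing_docx '/path/to/directory'
```

Because existing .docx files are skipped, the function can be re-run after an interruption without redoing finished conversions.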

Poker answered 21/12, 2020 at 1:16 Comment(1)
I have been looking for a way to batch edit text files from doc to docx for years, and the solution is so elegant. I only had to edit the "path/to/directory" and it worked! It took me two days to work through 8000 files; the remaining 8000 were done in ten minutes! You are brilliant! (And I guess I shouldn't write this here, but my upvotes aren't counted yet, so forgive me.) Thank you! Merry X-Mas!
Shcherbakov
