Merging two .odt files from code

How do you merge two .odt files from code? Doing it by hand (opening each file and copying the content) would work, but is infeasible for a large number of files.

I have tried the ODF Toolkit Simple API (simple-odf-0.8.1-incubating) for this task, creating an empty TextDocument and merging everything into it:

private File masterFile = new File(...);

...

TextDocument t = TextDocument.newTextDocument();
t.save(masterFile);

...

for(File f : filesToMerge){
   joinOdt(f);
}

...

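void joinOdt(File joinee) throws Exception {
   // Reload the master on every call so each merge builds on the previously saved result.
   TextDocument master = (TextDocument) TextDocument.loadDocument(masterFile);
   TextDocument slave = (TextDocument) TextDocument.loadDocument(joinee);
   // Append after the last paragraph of the master (reverse index 0, empty paragraphs included);
   // the final argument asks the API to copy the source document's styles as well.
   master.insertContentFromDocumentAfter(slave, master.getParagraphByReverseIndex(0, false), true);
   master.save(masterFile);
}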

That works reasonably well, but it loses information about fonts: the original files use a combination of Arial Narrow and Wingdings (for check boxes), while the output masterFile is entirely in Times New Roman. At first I suspected the last parameter of insertContentFromDocumentAfter, but changing it to false breaks (almost) all formatting. Am I doing something wrong? Is there any other way?

Hilde answered 8/8, 2014 at 9:57 Comment(4)
I am trying to achieve the same but cannot get that version of the library from Maven. - Unhealthy
@Unhealthy Yup, that version is not available on Maven, because in the half year since that version came out they still haven't fixed their build process. I downloaded it manually as part of odftoolkit-0.6.1 and installed it in my local repo from the jar. - Hilde
Currently I have the same problem to solve, and I see that there are only two good Java libraries for this: Apache ODF Toolkit and jOpenDocument. Which one is better? - Kamat
@Kamat The above is part of the Apache ODF Toolkit; I don't know the other one. If you try it and succeed at this task, be sure to let me know. - Hilde

I think this is "works as designed".

I tried this once with a global (master) document, which imports documents and displays them as is... as long as the paragraph styles have different names!

Styles that share a name are overwritten with the values the "master" document has.

So I ended up cloning standard styles with unique (per document) names.
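
A minimal sketch of that renaming idea, working directly on the XML parts of an .odt package (styles.xml and content.xml) with the JDK's DOM API rather than through the office UI. The class name StylePrefixer and the prefix value are hypothetical. It only covers paragraph styles and the paragraph and heading elements that reference them, so treat it as a starting point, not a complete solution:

```java
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: gives paragraph styles a per-document prefix.
final class StylePrefixer {
    static final String STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    /** Parses an XML string with namespace support switched on. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    /** Collects the names of all paragraph styles defined in one XML part. */
    static Set<String> paragraphStyleNames(Document part) {
        Set<String> names = new HashSet<>();
        NodeList styles = part.getElementsByTagNameNS(STYLE_NS, "style");
        for (int i = 0; i < styles.getLength(); i++) {
            Element style = (Element) styles.item(i);
            if ("paragraph".equals(style.getAttributeNS(STYLE_NS, "family"))) {
                names.add(style.getAttributeNS(STYLE_NS, "name"));
            }
        }
        return names;
    }

    /** Prefixes the given style names wherever a paragraph style defines or references them. */
    static void addPrefix(Document part, Set<String> names, String prefix) {
        NodeList all = part.getElementsByTagNameNS("*", "*");
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            String local = e.getLocalName();
            if (STYLE_NS.equals(e.getNamespaceURI()) && "style".equals(local)
                    && "paragraph".equals(e.getAttributeNS(STYLE_NS, "family"))) {
                rename(e.getAttributeNodeNS(STYLE_NS, "name"), names, prefix);
                rename(e.getAttributeNodeNS(STYLE_NS, "parent-style-name"), names, prefix);
                rename(e.getAttributeNodeNS(STYLE_NS, "next-style-name"), names, prefix);
            } else if (TEXT_NS.equals(e.getNamespaceURI())
                    && ("p".equals(local) || "h".equals(local))) {
                rename(e.getAttributeNodeNS(TEXT_NS, "style-name"), names, prefix);
            }
        }
    }

    private static void rename(Attr attr, Set<String> names, String prefix) {
        if (attr != null && names.contains(attr.getValue())) {
            attr.setValue(prefix + attr.getValue());
        }
    }
}
```

The intended use would be to collect the names once from styles.xml, then call addPrefix with the same set and the same prefix on both styles.xml and content.xml of one input file before merging. Calling it twice on the same part would prefix the names twice.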

HTH

Leitman answered 9/8, 2014 at 8:39 Comment(8)
I'm not savvy enough about the inner workings of .odt... But would that mean that, if my files were created the same way when it comes to styles, starting from one of them instead of an empty document would make the merge work the way I'd like? - Hilde
Create paragraph styles NOT based on existing (standard, pre-defined) ones. - Leitman
My workflow consists of converting multiple RTF files into a series of .odt files (using the Java UNO API) and then merging them, so I don't think that's a viable solution. - Hilde
So, when you convert from RTF, you have to respect the different styles (fonts, sizes...): in that step, do not alter the pre-defined styles, but create new ones that are not based on existing styles. New, "stand-alone" styles keep the attributes you set when you import (Insert > File... / insertContentFromDocument) into a newly created document (or a global document). When merging docs, the styles defined in the "new doc" override the customizations of the imported content (as observed). - Leitman
Then I would love a link to a tutorial on how to convert from RTF in that way (my method consists of simply using the built-in converter, something like #17200615 with "writerglobal8" as the name of the filter). Also, while useful and showing the problem with the approach I took, I'm afraid that your answer does not answer the question of how to do it correctly. - Hilde
I'm afraid there is no tutorial on that. I can only say what I observed: the only way around it is to create new paragraph styles, not based on existing ones, before merging the documents, to prevent style attributes from getting overwritten. - Leitman
If there is a way to do that from Java code by editing existing .odt files, that would answer my question. - Hilde
I can provide you with StarBasic code, and I reckon you can convert it to Java easily (just loop over every style, checking "isInUse"):
SUB break_Style_inheritance
   oDoc = ThisComponent
   oParaStyles = oDoc.StyleFamilies.getByName("ParagraphStyles")
   oMyStyle = oParaStyles.getByName("Textkoerper")
   oMyStyle.setParentStyle("")
END SUB
Hint: get the inspection tool MRI - it can produce Java code if needed. - Leitman
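
The StarBasic macro above goes through the UNO API of a running office instance. As a rough alternative sketch that edits an existing .odt file from plain Java, the following removes the style:parent-style-name attribute from every paragraph style in styles.xml and copies the rest of the package unchanged. This is a swapped-in technique (direct ZIP and XML editing with the JDK only), not a translation of the macro. The class name StyleInheritanceBreaker is hypothetical, and properties a style only inherited from its parent are not copied over, so they would presumably be lost:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: detaches paragraph styles from their parent styles.
final class StyleInheritanceBreaker {
    static final String STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0";

    /** Returns styles.xml content with the parent link removed from paragraph styles. */
    static byte[] detachParagraphStyles(byte[] stylesXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(stylesXml));
            NodeList styles = doc.getElementsByTagNameNS(STYLE_NS, "style");
            for (int i = 0; i < styles.getLength(); i++) {
                Element style = (Element) styles.item(i);
                if ("paragraph".equals(style.getAttributeNS(STYLE_NS, "family"))) {
                    style.removeAttributeNS(STYLE_NS, "parent-style-name"); // no-op if absent
                }
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException("Could not rewrite styles.xml", e);
        }
    }

    /** Copies an .odt package entry by entry into a different file, rewriting only styles.xml. */
    static void rewritePackage(Path source, Path target) {
        try (ZipFile zip = new ZipFile(source.toFile());
             ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(target))) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry original = entries.nextElement();
                byte[] data;
                try (InputStream in = zip.getInputStream(original)) {
                    data = in.readAllBytes();
                }
                if (original.getName().equals("styles.xml")) {
                    data = detachParagraphStyles(data);
                }
                ZipEntry copy = new ZipEntry(original.getName());
                if (original.getName().equals("mimetype")) {
                    // ODF packages keep the mimetype entry uncompressed.
                    CRC32 crc = new CRC32();
                    crc.update(data);
                    copy.setMethod(ZipEntry.STORED);
                    copy.setSize(data.length);
                    copy.setCompressedSize(data.length);
                    copy.setCrc(crc.getValue());
                }
                out.putNextEntry(copy);
                out.write(data);
                out.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A call such as StyleInheritanceBreaker.rewritePackage(input, output) would be run on each converted file before the merge; source and target must be different paths. Whether this alone is enough to keep the fonts through insertContentFromDocumentAfter has not been verified here.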

My case was a rather simple one: the files I wanted to merge were generated the same way and used the same basic formatting. Starting from one of my files instead of an empty document therefore fixed my problem.
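
Roughly, the change amounts to seeding the master file with a copy of the first input and appending only the remaining inputs. A minimal sketch of that control flow, where MergeSeed, mergeInto and appendStep are hypothetical names and appendStep stands in for the body of joinOdt from the question:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical helper: uses the first input as the master instead of an empty document.
final class MergeSeed {
    /**
     * Copies the first input over the master file, then hands every remaining
     * input to appendStep as (master, input).
     */
    static void mergeInto(Path master, List<Path> inputs, BiConsumer<Path, Path> appendStep) {
        if (inputs.isEmpty()) {
            throw new IllegalArgumentException("Nothing to merge");
        }
        try {
            // The master starts out with the styles and fonts of the first document.
            Files.copy(inputs.get(0), master, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (Path input : inputs.subList(1, inputs.size())) {
            appendStep.accept(master, input);
        }
    }
}
```

This only helps when all inputs define their styles the same way, as they did in my case.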

However, this question will remain open until someone comes up with a more general solution for retaining formatting (possibly based on ngulam's answer and comments?).

Hilde answered 12/8, 2014 at 8:25 Comment(0)
