Editing custom XML part in Word document sometimes corrupts the document

We have a system that stores some custom templating data in a Word document. Sometimes, updating this data causes Word to complain that the document is corrupted. When that happens, if I unzip the docx file and compare the contents to the previous version, the only difference appears to be the expected change in the customXML\item.xml file. If I re-zip the contents using 7-Zip, the problem goes away (Word no longer complains that the document is corrupt).

The (simplified) code:

void CreateOrReplaceCustomXml(string filename, MyCustomData data)
{
    using (var doc = WordprocessingDocument.Open(filename, true))
    {
        var part = GetCustomXmlParts(doc).SingleOrDefault();
        if (part == null)
        {
            part = doc.MainDocumentPart.AddCustomXmlPart(CustomXmlPartType.CustomXml);
        }

        var serializer = new DataContractSerializer(typeof(MyCustomData));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, data);
            stream.Seek(0, SeekOrigin.Begin);
            part.FeedData(stream);
        }
    }
}

IEnumerable<CustomXmlPart> GetCustomXmlParts(WordprocessingDocument doc)
{
    return doc.MainDocumentPart.CustomXmlParts
        .Where(part =>
        {
            using (var stream = doc.Package.GetPart(part.Uri).GetStream())
            using (var streamReader = new StreamReader(stream))
            {
                return streamReader.ReadToEnd().Contains("Some.Namespace");
            }
        });
}

Any suggestions?

Olag answered 15/5, 2014 at 8:27 Comment(1)
Would it be possible to post a reproducing sample? - Eversole
G
1

Since re-zipping works, the content itself seems to be well-formed.

That suggests the problem lies in how the package is zipped. Open the corrupted docx in 7-Zip and note the values in the "Method" column (especially for customXML\item.xml).

Compare that value to a working docx: is it the same or different? Method "Deflate" works.
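
If you would rather do the size comparison in code than by eye in 7-Zip, something like the following might help. It is only a sketch: DocxEntryReport and Describe are names made up for this example, and it reports only each entry's name, uncompressed size and packed size. It does not read the "Method" column, so 7-Zip is still needed for that.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical helper, not part of the Open XML SDK.
public static class DocxEntryReport
{
    // Returns one tab-separated line per ZIP entry:
    // entry name, uncompressed size, packed (compressed) size.
    public static List<string> Describe(Stream docxStream)
    {
        using (var archive = new ZipArchive(docxStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            // ToList() forces evaluation before the archive is disposed.
            return archive.Entries
                .OrderBy(e => e.FullName, StringComparer.Ordinal)
                .Select(e => e.FullName + "\t" + e.Length + "\t" + e.CompressedLength)
                .ToList();
        }
    }

    // Convenience overload for a file on disk.
    public static List<string> Describe(string path)
    {
        using (var file = File.OpenRead(path))
        {
            return Describe(file);
        }
    }
}
```

Running Describe on the corrupted file and on the re-zipped copy, then comparing the two lists (for example with Enumerable.Except), should show which entries differ in size, if any.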

Gunter answered 17/5, 2014 at 21:59 Comment(3)
The same: everything is "Deflate". - Olag
OK, good. I'd focus on your C# code then. Does the problem occur when the update makes the data shorter, or longer, or is there no pattern? Is the packed size (reported in 7-Zip) the same before and after re-zip? - Gunter
This helped me figure out my problem, so I'm happy to award it the rep :) My actual problem was to do with writing the finished Word document to file... - Olag
Z
0

I faced the same issue and it turned out it was due to encoding. Do you already specify the same encoding when serializing/deserializing?
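
To illustrate what pinning the encoding could look like, here is a minimal sketch. It assumes a stand-in MyCustomData type (the real one is not shown in the question) with a made-up Title property and the "Some.Namespace" contract namespace; CustomXmlEncoding, SerializeAsUtf8 and DeserializeUtf8 are names invented for this example. The idea is to route DataContractSerializer through an XmlWriter whose encoding is set explicitly, rather than relying on defaults.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

// Stand-in for the question's MyCustomData; the Title property is an assumption.
[DataContract(Namespace = "Some.Namespace")]
public class MyCustomData
{
    [DataMember]
    public string Title { get; set; }
}

// Hypothetical helper showing one way to make the encoding explicit.
public static class CustomXmlEncoding
{
    public static byte[] SerializeAsUtf8(MyCustomData data)
    {
        var settings = new XmlWriterSettings
        {
            // UTF-8 without a byte order mark.
            Encoding = new UTF8Encoding(false)
        };
        var serializer = new DataContractSerializer(typeof(MyCustomData));
        using (var stream = new MemoryStream())
        {
            using (var writer = XmlWriter.Create(stream, settings))
            {
                serializer.WriteObject(writer, data);
            } // disposing the writer flushes everything into the stream
            return stream.ToArray();
        }
    }

    public static MyCustomData DeserializeUtf8(byte[] bytes)
    {
        var serializer = new DataContractSerializer(typeof(MyCustomData));
        using (var stream = new MemoryStream(bytes))
        using (var reader = XmlReader.Create(stream))
        {
            return (MyCustomData)serializer.ReadObject(reader);
        }
    }
}
```

The resulting bytes can be wrapped in a MemoryStream and passed to part.FeedData as in the question. Whether encoding is actually the cause in your case is something only a comparison of the stored bytes would confirm.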

Zsazsa answered 22/5, 2014 at 14:9 Comment(0)
R
0

A couple of suggestions:

a. Try doc.Package.Flush(); after you write the data back into the custom XML part.
b. You may have to delete all custom XML parts and add a new one.

We are using the following code and it seems to work fine.

public static void ReplaceCustomXML(WordprocessingDocument myDoc, string customXML)
{
    MainDocumentPart mainPart = myDoc.MainDocumentPart;

    // Remove every existing custom XML part, then add a fresh one.
    mainPart.DeleteParts<CustomXmlPart>(mainPart.CustomXmlParts);
    CustomXmlPart customXmlPart = mainPart.AddCustomXmlPart(CustomXmlPartType.CustomXml);

    // Disposing the writer flushes and closes the part stream.
    using (StreamWriter ts = new StreamWriter(customXmlPart.GetStream()))
    {
        ts.Write(customXML);
    }
}

public static MemoryStream GetCustomXmlPart(MainDocumentPart mainPart)
{
    foreach (CustomXmlPart part in mainPart.CustomXmlParts)
    {
        using (XmlTextReader reader =
            new XmlTextReader(part.GetStream(FileMode.Open, FileAccess.Read)))
        {
            reader.MoveToContent();

            // Match on the root element name of the part we are looking for.
            if (reader.Name.Equals("aaaa", StringComparison.OrdinalIgnoreCase))
            {
                string str = reader.ReadOuterXml();

                // UTF-8 rather than ASCII, so non-ASCII characters are not replaced with '?'.
                byte[] byteArray = Encoding.UTF8.GetBytes(str);
                return new MemoryStream(byteArray);
            }
        }
    }

    // No matching custom XML part was found.
    return null;
}

using (WordprocessingDocument myDoc = WordprocessingDocument.Open(ms, true))
{
    StreamReader reader = new StreamReader(memStream);
    string FullXML = reader.ReadToEnd();
    ReplaceCustomXML(myDoc, FullXML);

    myDoc.Package.Flush();

    //Code to save file
}
Rothstein answered 23/5, 2014 at 13:46 Comment(0)
