Using a template with OpenXML and SAX
Asked Answered
F

1

6

I'm creating a large XLSX file from a datatable, using the SAX method proposed in Parsing and Reading Large Excel Files with the Open XML SDK. I'm using an XLSX file as a template.

The method described in that post works fine to substitute a new sheet in for an existing one, but I want to copy the header row from the sheet in the template (string values, formatting, etc), instead of just using the header row from the datatable as the original code does.

I've tried the code below, but the XLSX file ends up with no text in the header row - the formatting is copied, just not the text. I've looked in the XML file for the sheet and it looks OK to me (referencing the sharedStrings.xml file, which still has the definition of the strings). The reflected code from the Open XML SDK 2.0 Productivity Tool shows a slightly odd result though: the cells don't appear to have a text value set:

cellValue1.Text = "";

even though the XML says:

<x:c r="A1" s="4" t="s">

The main code used by the OpenXmlReader is below:

while (reader.Read())
{
    if (reader.ElementType == typeof(SheetData))
    {
        if (reader.IsEndElement)
            continue;

        // Write sheet element
        writer.WriteStartElement(new SheetData());

        // copy header row from template
        reader.Read();
        do
        {
            if (reader.IsStartElement)
            {
                writer.WriteStartElement(reader);
                        }
            else if (reader.IsEndElement)
            {
                writer.WriteEndElement();
            }
            reader.Read();
        } while (!(reader.ElementType == typeof(Row) && reader.IsEndElement));
        writer.WriteEndElement();

        // Write data rows
        foreach (DataRow dataRow in resultsTable.Rows)
        {
            // Write row element
            Row r = new Row();
            writer.WriteStartElement(r);

            foreach (DataColumn dataCol in resultsTable.Columns)
            {
                Cell c = new Cell();
                c.DataType = CellValues.String;
                CellValue v = new CellValue(dataRow[dataCol].ToString());
                c.Append(v);

                // Write cell element
                writer.WriteElement(c);
            }

            // End row
            writer.WriteEndElement();
        }

        // End sheet
        writer.WriteEndElement();
    }
    else
    {
        if (reader.IsStartElement)
        {
            writer.WriteStartElement(reader);
        }
        else if (reader.IsEndElement)
        {
            writer.WriteEndElement();
        }
    }
}
Fishback answered 4/10, 2011 at 17:12 Comment(1)
Does this really handle all data types (number, boolean, shared string, date, timespan)? "c.DataType = CellValues.String;"Atalante
F
9

The clue was that the Productivity Tool was showing blank values for the header cells on the generated sheet, and also that the validation formulae from the template were missing. These are both text, which wasn't copied from the template sheet to the new sheet using the combination of OpenXmlReader.Read() and OpenXmlReader.WriteStartElement().

When the element is an OpenXmlLeafTextElement then the OpenXmlReader.GetText() method will return the text - this works for both text values in cells and for formulae.

The working code is shown below:

while (openXmlReader.Read())
{
    if (openXmlReader.ElementType == typeof(SheetData))
    {
        if (openXmlReader.IsEndElement)
            continue;

        // write sheet element
        openXmlWriter.WriteStartElement(new SheetData());

        // read first row from template and copy into the new sheet
        openXmlReader.Read();

        do
        {
            if (openXmlReader.IsStartElement)
            {
                openXmlWriter.WriteStartElement(openXmlReader);
                
                // this bit is needed to get cell values
                if (openXmlReader.ElementType.IsSubclassOf(typeof(OpenXmlLeafTextElement)))
                {
                    openXmlWriter.WriteString(openXmlReader.GetText());
                }
            }
            else if (openXmlReader.IsEndElement)
            {
                openXmlWriter.WriteEndElement();
            }

            openXmlReader.Read();

        } while (!(openXmlReader.ElementType == typeof(Row) && openXmlReader.IsEndElement));

        openXmlWriter.WriteEndElement();

        // write data rows
        foreach (DataRow dataRow in resultsTable.Rows)
        {
            // write row element
            Row r = new Row();

            openXmlWriter.WriteStartElement(r);

            foreach (DataColumn dataCol in resultsTable.Columns)
            {
                Cell c = new Cell();
                c.DataType = CellValues.String;
                CellValue v = new CellValue(dataRow[dataCol].ToString());
                c.Append(v);

                // write cell element
                openXmlWriter.WriteElement(c);
            }

            // end row
            openXmlWriter.WriteEndElement();
        }

        // end sheet
        openXmlWriter.WriteEndElement();
    }
    else
    {
        if (openXmlReader.IsStartElement)
        {
            openXmlWriter.WriteStartElement(openXmlReader);

            // this bit is needed to get formulae and that kind of thing
            if (openXmlReader.ElementType.IsSubclassOf(typeof(OpenXmlLeafTextElement)))
            {
                openXmlWriter.WriteString(openXmlReader.GetText());
            }
        }
        else if (openXmlReader.IsEndElement)
        {
            openXmlWriter.WriteEndElement();
        }
    }
}
Fishback answered 5/10, 2011 at 13:21 Comment(2)
What is the writer Object?Nonpareil
The writer is an OpenXmlWriter - I've changed the variable names to make it clearerFishback

© 2022 - 2024 — McMap. All rights reserved.