OpenXML SDK 2.5 unreadable content
Asked Answered
M

2

9

I am working on taking an existing Excel File which already has all of its formulas and formatting, I add data to the a sheet with a Table and when I then open that file in Excel I get the error

"Excel completed file level validation and repair. Some parts of this workbook may have been repaired or discarded. Removed Records: Cell information from /xl/worksheets/sheet6.xml part"

I then open a manually created file with the same content and it works just fine. I also discovered the Open XML 2.5 Productivity Tool which when I run validation on the generated file it says no issues found.

When I run a compare on the two files I see the Generated file looks like this.

        <x:c r="B462" t="inlineStr">
            <x:is>
                <x:t>1150828</x:t>
            </x:is>
        </x:c>

While the Manually created file has Cells that look like this.

       <c s="80" r="B462">
         <v>
           1150828
         </v>
       </c>

Obviously there is a difference here but I don't know how to correct it nor do I know if this difference is the actual cause of the error. but seeing as how everything else seems to look the same I don't know what else it could be.

Oh and couple more things this file is not working but I am able to use another file that does not contain a Table, when I incorporate a table the issue occurs so I at least know that much.

Also if you're going to suggest that I use ClosedXML please don't. I have used it and it tends to leave off formatting at random for some reason that I cannot figure out hence why I have moved to OpenXML SDk

Here is some of the C# Code

 dt.Load(reader);
                            RowCount = dt.Rows.Count;
                            ColumnCount = dt.Columns.Count;

                                workbookPart = spreadDoc.WorkbookPart;
                                SheetDimension sheetDimension = new SheetDimension() { Reference = "A1:" + ColumnLetters[ColumnCount - 1] + (RowCount + 1) };

                                worksheetPart = Program.GetWorksheetPart(workbookPart, reportStep.ExcelSheetName);
                                worksheetPart.Worksheet.SheetDimension = sheetDimension;

                                SheetData sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();
                                string relId = workbookPart.Workbook.Descendants<Sheet>().First(s => reportStep.ExcelSheetName.Equals(s.Name)).Id;

                                if (reportStep.ExcelTableExists)
                                {
                                    TableDefinitionPart tableDef = null;

                                    int looper = 0;
                                    foreach (WorksheetPart wsp in spreadDoc.WorkbookPart.WorksheetParts)
                                    {
                                        if (wsp.TableDefinitionParts.Where(tbl => tbl.Table.DisplayName.Value.Equals(reportStep.ExcelTableName)).Count() == 1)
                                        {
                                            tableDef = spreadDoc.WorkbookPart.WorksheetParts.ElementAt(looper).TableDefinitionParts.Where(tbl => tbl.Table.DisplayName.Value.Equals(reportStep.ExcelTableName)).FirstOrDefault();
                                            tableDef.Table.Reference.Value = "A1:" + (ColumnLetters[ColumnCount - 1] + (RowCount +1) ).ToString();
                                            tableDef.Table.AutoFilter.Reference.Value = "A1:" + (ColumnLetters[ColumnCount - 1] + (RowCount +1)).ToString();
                                           // tabledefinitionPart = Program.GetTablePart(wsp, reportStep.ExcelTableName, ColumnCount, RowCount);
                                        }
                                        looper++;
                                    }


                                }

                                sheetData = Chef.Program.ExportDataTable(dt, sheetData);
                                Sheet sheet = workbookPart.Workbook.Descendants<Sheet>().FirstOrDefault(s => s.Name == reportStep.ExcelSheetName);









public static TableDefinitionPart GetTablePart(WorksheetPart worksheet, string tablename, int columnCount, int rowCount)
    {
        uint CellRange = (uint)(columnCount);
        TableColumns tableColumns1 = new TableColumns() { Count = (UInt32Value)(CellRange) };

        var tableDefPart = worksheet.TableDefinitionParts.Where(tbl => tbl.Table.DisplayName.Value.Equals(tablename)).FirstOrDefault();
            //worksheet.WorksheetPart.TableDefinitionParts.AddNewPart<TableDefinitionPart>(tablename);
        var table = new Table() { HeaderRowCount = (uint)columnCount, Name = tablename, DisplayName = tablename, Reference = "A1:" + ColumnLetters[columnCount -1] + (rowCount + 1), TotalsRowShown = false };
        TableStyleInfo tableStyleInfo1 = new TableStyleInfo()
        {
            Name = "TableStyleMedium2",
            ShowFirstColumn = false,
            ShowLastColumn = false,
            ShowRowStripes = true,
            ShowColumnStripes = false
        };

        table.Append(tableStyleInfo1);
       // table.Append(tableColumns1);
        tableDefPart.Table = table;
        return tableDefPart;
    }

EDIT SECTION ADDING IN THE ADDITIONAL METHODS REQUESTED Updated 9/5/15

I did remove the code that added the header values since they are already a part of the base template of the excel file. also removed the specifying of the cell datatype to preserve what the template already had the cell datatype set to.

       public static SheetData ExportDataTable2(System.Data.DataTable exportData, SheetData sheetData)
    {
        //loop through each data row  
        DataRow contentRow;
        int startRow = 2;
        for (int i = 0; i < exportData.Rows.Count; i++)
        {
            contentRow = exportData.Rows[i];
            sheetData.AppendChild(createContentRow(contentRow, i + startRow));
        }

        return sheetData;
    }


    private static Cell createTextCell(int columnIndex, int rowIndex, object cellValue)
    {
        Cell cell = new Cell();

       // cell.DataType = CellValues.Number;
        cell.CellReference = getColumnName(columnIndex) + rowIndex;
        cell.CellValue = new CellValue(cellValue.ToString());

        return cell;
    }

    private static Row createContentRow(DataRow dataRow, int rowIndex)
    {

        Row row = new Row
        {
            RowIndex = (UInt32)rowIndex
        };

        for (int i = 0; i < dataRow.Table.Columns.Count; i++)
        {
            Cell dataCell = createTextCell(i + 1, rowIndex, dataRow[i]);
          //  dataCell.DataType = CellValues.SharedString;
            row.AppendChild(dataCell);
        }

        return row;
    }
Mackey answered 28/8, 2015 at 18:15 Comment(5)
Can you post the code for your Chef.Program.ExportDataTable method please? The XML you've listed looks OK; the generated one is using inline strings but the manually created one is using the shared strings table.Jeanmariejeanna
you are on the right track I believe, I have added in the additional Methods to show how the data is being addedMackey
Actually, I think that cell value in manually created file is a number (not a reference to shared strings table), while the cell value in generated file is a (inline) string. Change your code so that you insert a number value into cell instead of an inline string. This is probably somehow related to Tables (since you said that generated file without Table works OK), perhaps Table requires cell value to be number.Instinct
So I actually changed the code the other night removing the line cell.DataType = CellValues.String; and although it works now in the aspect that the fields get populated and the calculations on the other sheets now work but I still get the error saying that unreadable content. if this was just for my own personal use than I would leave well enough alone but since these are files being sent to heads of my company I kind of need to figure out what is causing this. So from what I gather by not specifying the cell datatype it is just keeping what the template had as a datatype alreadyMackey
Also I have tracked down that the issue has to be with how the cell's data is being entered since I can open the file and save it without adding data to it and it works fine. I am updating the code above to reflect how it currently is after my revisionsMackey
A
5

Well, it appears that you have used the following example OpenXML SDK 2.0: Export a DataTable to Excel as a base for your code. Here is the original code for creating a cell:

private Cell createTextCell(int columnIndex, int rowIndex, object cellValue)
{
    Cell cell = new Cell();

    cell.DataType = CellValues.InlineString;
    cell.CellReference = getColumnName(columnIndex) + rowIndex;
    InlineString inlineString = new InlineString();
    Text t = new Text();

    t.Text = cellValue.ToString();
    inlineString.AppendChild(t);
    cell.AppendChild(inlineString);

    return cell;
}

Your original code was exactly the same except the following line:

cell.DataType = CellValues.String;

See the difference?

Then you have changed it to:

private static Cell createTextCell(int columnIndex, int rowIndex, object cellValue)
    {
        Cell cell = new Cell();

       // cell.DataType = CellValues.Number;
        cell.CellReference = getColumnName(columnIndex) + rowIndex;
        cell.CellValue = new CellValue(cellValue.ToString());

        return cell;
    }

Ok, the problem is that you don't set cell.DataType correctly. It needs to be in sync with the cell content, otherwise you'll get such an errors from Excel. In the former case you set content to inline string, but data type to String. In the later - data type to Number (it doesn't matter that you have commented the line - Number is the default data type for cells) but the content is not always a number (the same function is used for the column headears - after all, it's called createTextCell).

In order to fix the problem, either use the original code from the example, or this code:

private static Cell createTextCell(int columnIndex, int rowIndex, object cellValue)
{
    Cell cell = new Cell();
    cell.DataType = CellValues.String;
    cell.CellReference = getColumnName(columnIndex) + rowIndex;
    cell.CellValue = new CellValue(cellValue.ToString());
    return cell;
}

Finally, if you need to store a shared string, number, date etc., read the documentation and set the appropriate properties. I would aggree that the OpenXml API is not very intuitive, but this is what we have.

EDIT: Based on your comments, seems like your real problem is not exactly the one in question. Here is a high performance example of exporting DataTable with different data type columns:

public static class ExcelExporter
{
    public static void ExportDataTable(DataTable table, SheetData data)
    {
        var cellFactory = new CellFactory[table.Columns.Count];
        for (int i = 0; i < table.Columns.Count; i++)
            cellFactory[i] = GetCellFactory(table.Columns[i].DataType);
        int rowIndex = 0;
        data.AppendChild(CreateHeaderRow(rowIndex++, table));
        for (int i = 0; i < table.Rows.Count; i++)
            data.AppendChild(CreateContentRow(rowIndex++, table.Rows[i], cellFactory));
    }
    private static Row CreateHeaderRow(int rowIndex, DataTable table)
    {
        var row = CreateRow(rowIndex);
        for (int i = 0; i < table.Columns.Count; i++)
        {
            var cell = CreateTextCell(i, rowIndex, table.Columns[i].ColumnName);
            row.AppendChild(cell);
        }
        return row;
    }
    private static Row CreateContentRow(int rowIndex, DataRow dataRow, CellFactory[] cellFactory)
    {
        var row = CreateRow(rowIndex);
        for (int i = 0; i < dataRow.Table.Columns.Count; i++)
        {
            var cell = cellFactory[i](i, rowIndex, dataRow[i]);
            row.AppendChild(cell);
        }
        return row;
    }
    private static Row CreateRow(int index) { return new Row { RowIndex = (uint)index + 1 }; }
    private delegate Cell CellFactory(int columnIndex, int rowIndex, object cellValue);
    private static CellFactory GetCellFactory(Type dataType)
    {
        CellFactory factory;
        return CellFactoryMap.TryGetValue(dataType, out factory) ? factory : TextCellFactory;
    }
    private static readonly CellFactory TextCellFactory = CreateTextCell;
    private static readonly CellFactory DateCellFactory = CreateDateCell;
    private static readonly CellFactory NumericCellFactory = CreateNumericCell;
    private static readonly CellFactory BooleanCellFactory = CreateBooleanCell;
    private static readonly Dictionary<Type, CellFactory> CellFactoryMap = new Dictionary<Type, CellFactory>
    {
        { typeof(bool), BooleanCellFactory },
        { typeof(DateTime), DateCellFactory },
        { typeof(byte), NumericCellFactory },
        { typeof(sbyte), NumericCellFactory },
        { typeof(short), NumericCellFactory },
        { typeof(ushort), NumericCellFactory },
        { typeof(int), NumericCellFactory },
        { typeof(uint), NumericCellFactory },
        { typeof(long), NumericCellFactory },
        { typeof(ulong), NumericCellFactory },
        { typeof(float), NumericCellFactory },
        { typeof(double), NumericCellFactory },
        { typeof(decimal), NumericCellFactory },
    };
    private static Cell CreateTextCell(int columnIndex, int rowIndex, object cellValue)
    {
        return CreateCell(CellValues.String, columnIndex, rowIndex, ToExcelValue(cellValue));
    }
    private static Cell CreateDateCell(int columnIndex, int rowIndex, object cellValue)
    {
        // NOTE: CellValues.Date is not supported in older Excel version.
        // In all Excel versions dates can be stored with CellValues.Number and a format style.
        // Since I have no styles, will export them just as text
        //var cell = CreateCell(CellValues.Number, columnIndex, rowIndex, ToExcelDate(cellValue));
        //cell.StyleIndex = ...;
        //return cell;
        return CreateCell(CellValues.String, columnIndex, rowIndex, 
            cellValue != null && cellValue != DBNull.Value ? ((DateTime)cellValue).ToShortDateString() : null);
    }
    private static Cell CreateNumericCell(int columnIndex, int rowIndex, object cellValue)
    {
        return CreateCell(CellValues.Number, columnIndex, rowIndex, ToExcelValue(cellValue));
    }
    private static Cell CreateBooleanCell(int columnIndex, int rowIndex, object cellValue)
    {
        // NOTE: CellValues.Boolean is not supported in older Excel version
        //return CreateCell(CellValues.Boolean, columnIndex, rowIndex, ToExcelValue(cellValue));
        return CreateCell(CellValues.String, columnIndex, rowIndex, ToExcelValue(cellValue));
    }
    private static Cell CreateCell(CellValues dataType, int columnIndex, int rowIndex, string cellValue)
    {
        var cell = new Cell();
        if (dataType != CellValues.Number) cell.DataType = dataType;
        cell.CellReference = GetColumnName(columnIndex) + (rowIndex + 1);
        cell.CellValue = new CellValue(cellValue ?? string.Empty);
        return cell;
    }
    private static string ToExcelValue(object value)
    {
        if (value == null || value == DBNull.Value) return null;
        return Convert.ToString(value, CultureInfo.InvariantCulture);
    }
    private static DateTime ExcelBaseDate = new DateTime(1900, 1, 1);
    private static string ToExcelDate(object value)
    {
        const int days29Feb1900 = 59;
        if (value == null || value == DBNull.Value) return null;
        var date = ((DateTime)value).Date;
        var days = (date - ExcelBaseDate).Days + 1;
        if (days >= days29Feb1900) days++;
        return days.ToString(CultureInfo.InvariantCulture);
    }
    private static string GetColumnName(int index) { return ColumnNameTable[index]; }
    private static readonly string[] ColumnNameTable = BuildColumnNameTable();
    private static string[] BuildColumnNameTable()
    {
        var table = new string[16384];
        var sb = new StringBuilder();
        for (int i = 0; i < table.Length; i++)
            table[i] = sb.BuildColumnName(i);
        return table;
    }
    private static string BuildColumnName(this StringBuilder sb, int index)
    {
        const int startLetter = 'A';
        const int letterCount = 'Z' - startLetter + 1;
        sb.Clear();
        while (true)
        {
            var letter = (char)(startLetter + (index % letterCount));
            sb.Insert(0, letter);
            if (index < letterCount) break;
            index = (index / letterCount) - 1;
        }
        return sb.ToString();
    }
}

The key point is, instead of checking the type of every value during the processing, prepare at the beginning a different create cell method for each column based on it's data type.

Augment answered 7/9, 2015 at 19:13 Comment(14)
here is the thing, if I comment out the line cell.DataType = CellValues.String; the file generates and errors but still works after Excel does its repair.Mackey
if I keep that line than it does not error but it's calculations do not work unless I do the Text To Columns trick in Excel to each of the columns in the table. neither of these options are acceptableMackey
I just made this change private static Cell createTextCell2(int columnIndex, int rowIndex, object cellValue) { Cell cell = new Cell(); if(cellValue is string) { cell.DataType = CellValues.String; } if(cellValue is int) { cell.DataType = CellValues.Number; }Mackey
Well, read the last paragraph from my answer. Your question was how to avoid Excel error, and I provided you the solution. The function is called createTextCell, so was the fix. If you need to use it for a different purpose, better rename it to CreateCell and put whatever if statements you want inside. Commenting a code to "fix" some case is not a good practice and almost always breaks other cases as we see. The SDK sample you used is too primitive, but the word Text in the function name is a clear indication that the function is supposed to write text and nothing else.Augment
You are correct I used the code as a baseline since I have never used the SDK before, you are also correct that the naming convention should be changed since it is misleading. Although the if statement fixed one report the following report is coming up with similar error messages. This one appears to be related to a date field. unfortunately though when I check to see if the value is of a date format it still gives me the error. Plus now with putting the checks in place a report that did take roughly 2 seconds to run is now taking upwards of 10 minutes.Mackey
One thing I have been trying to do is NOT have to specify the format of the data and just keep the format that was saved in the excel template file that I am using. if you know of a way to just enter the data without having to specify the data type that would be the most helpful information (I would assume that is)Mackey
I think you can't, because you are creating new cells. What types of data are you supporting - string, int, double(?), decimal(?), date(?) etc? and are you using DataTable as a source in your real code?Augment
Supporting all the standard data types you mention with main focus being on string, int and date. yes I am querying the database returning the result in a datatable and passing the datatable to ExportDataTable2Mackey
Can you wait for tomorrow? Because it's almost midnight here. I think I can prepare a better example for exporting DataTable with different data types with high performance.Augment
Put me a note when you get back. And also, in order to not waste time, let me know if you have any constraints like .NET framework version, C# (VS) version etc.Augment
here is the App Info <?xml version="1.0" encoding="utf-8"?> <configuration> <startup> <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.5" /> </startup> <runtime> <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1"> <dependentAssembly> <assemblyIdentity name="DocumentFormat.OpenXml" publicKeyToken="31bf3856ad364e35" culture="neutral" /> <bindingRedirect oldVersion="0.0.0.0-2.5.5631.0" newVersion="2.5.5631.0" /> </dependentAssembly> </assemblyBinding> </runtime> </configuration>Mackey
Due to going into surgery first thing in the morning I will check this late this coming weekend and get back to you. Thank you for your responseMackey
I am marking this as resolved, I only changed one thing, Apparently since my "template" excel files already have the headers to the tables in them when using the CreateHeaderRow it gives me an error so I modified it below.Mackey
public static void ExportDataTable(DataTable table, SheetData data) { var cellFactory = new CellFactory[table.Columns.Count]; for (int i = 0; i < table.Columns.Count; i++) cellFactory[i] = GetCellFactory(table.Columns[i].DataType); int rowIndex = 0; // data.AppendChild(CreateHeaderRow(rowIndex++, table)); rowIndex++; for (int i = 0; i < table.Rows.Count; i++) data.AppendChild(CreateContentRow(rowIndex++, table.Rows[i], cellFactory)); }Mackey
D
0

I also had problems regarding invalid files working with the OpenXml Sdk. Have a look at the OpenXml Power Tools; they've solved all my issues :) Also you should switch to the OpenXml Sdk 2.6 to avoid problems with System.IO.Packaging. I hope this helps!

Djambi answered 31/8, 2015 at 9:25 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.