I was also trying to figure out how to properly iterate through the objects and get at the data I need with this API.
I collected info from various posts and the author's getting started page and put it all together to help myself and others.
The main issue is your entry point for iteration. Most solutions I've seen go after the Worksheet, whereas this question is specifically about the Table. I was curious about both, so I'm presenting my findings on both.
Worksheet Example:
using System.IO;
using System.Linq;
using OfficeOpenXml;

using (var package = new ExcelPackage(new FileInfo(file)))
{
    //what I've seen used the most: the entry point is the worksheet, not the table within the worksheet(s)
    using (var worksheet = package.Workbook.Worksheets.FirstOrDefault())
    {
        //Dimension is null when the worksheet has no cells, so check it too
        if (worksheet != null && worksheet.Dimension != null)
        {
            for (int rowIndex = worksheet.Dimension.Start.Row; rowIndex <= worksheet.Dimension.End.Row; rowIndex++)
            {
                var row = worksheet.Row(rowIndex);
                //from comments here... https://github.com/JanKallman/EPPlus/wiki/Addressing-a-worksheet
                //#:# gets an entire row, A:A gets an entire column
                var rowCells = worksheet.Cells[$"{rowIndex}:{rowIndex}"];
                //.Text on a multi-cell range gave me "System.Object[,]"
                //the property is a string, so it likely sees many cells and doesn't know how you want them formatted together...
                var rowCellsText = rowCells.Text;
                //...so join the individual cell texts yourself
                var rowCellsTextMany = string.Join(", ", rowCells.Select(x => x.Text));
                var allEmptyColumnsInRow = rowCells.All(x => string.IsNullOrWhiteSpace(x.Text));
                var firstCellInRowWithText = rowCells.FirstOrDefault(x => !string.IsNullOrWhiteSpace(x.Text));
                var firstCellInRowWithTextText = firstCellInRowWithText?.Text;
                var firstCellFromRow = rowCells[rowIndex, worksheet.Dimension.Start.Column];
                var firstCellFromRowText = firstCellFromRow.Text;
                //throws an exception...
                //var badRow = rowCells[worksheet.Dimension.Start.Row - 1, worksheet.Dimension.Start.Column - 1];
                //for me this happened with row 1 + row 2 being merged together for the column headers
                //not sure why the row's Merged property is false for both rows though
                if (allEmptyColumnsInRow)
                    continue;
                for (int columnIndex = worksheet.Dimension.Start.Column; columnIndex <= worksheet.Dimension.End.Column; columnIndex++)
                {
                    var column = worksheet.Column(columnIndex);
                    var currentRowColumn = worksheet.Cells[rowIndex, columnIndex];
                    var currentRowColumnText = currentRowColumn.Text;
                    var currentRowColumnAddress = currentRowColumn.Address;
                    //you likely won't need this, but I wanted to show that you can branch off at any level with that info via another call
                    //similar to the row: use A:A or B:B here; the address is A#, so just take the first char of the address
                    //note: taking only the first char works for columns A to Z, not for AA and beyond
                    var columnCells = worksheet.Cells[$"{currentRowColumnAddress[0]}:{currentRowColumnAddress[0]}"];
                    var columnCellsTextMany = string.Join(", ", columnCells.Select(x => x.Text));
                    var allEmptyRowsInColumn = columnCells.All(x => string.IsNullOrWhiteSpace(x.Text));
                    var firstCellInColumnWithText = columnCells.FirstOrDefault(x => !string.IsNullOrWhiteSpace(x.Text));
                    var firstCellInColumnWithTextText = firstCellInColumnWithText?.Text;
                }
            }
        }
    }
}
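One caveat about the `currentRowColumnAddress[0]` trick in the worksheet example: it only takes the first character of the address, so a cell such as AA1 would yield "A" rather than "AA". Below is a minimal sketch of a helper that builds the column letters from the 1-based column index instead. The `ColumnLetters` class and `FromIndex` method are hypothetical names of my own, not part of EPPlus, and the code uses only the standard library.

```csharp
using System;

// Hypothetical helper (not part of EPPlus): turns a 1-based column index
// into Excel column letters, e.g. 1 -> "A", 26 -> "Z", 27 -> "AA".
public static class ColumnLetters
{
    public static string FromIndex(int columnIndex)
    {
        if (columnIndex < 1)
            throw new ArgumentOutOfRangeException(nameof(columnIndex), "Column index is 1-based.");

        var letters = string.Empty;
        while (columnIndex > 0)
        {
            // Column letters are a base-26 system without a zero digit,
            // so shift by one before taking the remainder.
            int remainder = (columnIndex - 1) % 26;
            letters = (char)('A' + remainder) + letters;
            columnIndex = (columnIndex - 1) / 26;
        }
        return letters;
    }
}
```

With that in place, the whole-column address could be built as `$"{ColumnLetters.FromIndex(columnIndex)}:{ColumnLetters.FromIndex(columnIndex)}"` using the loop's `columnIndex`, assuming you still want the letter-based `A:A` form of addressing.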
Now things can get a bit messed up here. In my case, I had no tables to start with. Under the same package using statement, if I first iterated over the worksheet cells and then touched anything on the Tables property, it threw an exception. If I re-instantiate a package and use the same or similar code, it doesn't blow up when checking whether we have any Tables.
Table Example:
//for some reason, if I don't instantiate another package and I work with the 'Tables' property in any way, the API throws...
//Object reference not set to an instance of an object.
//at OfficeOpenXml.ExcelWorksheet.get_Tables()
//this happens when I have data in my worksheet but not an actual 'table' (Excel => Insert => Table)
//a partial load of worksheet cell data followed by a call to get non-existent tables must hit a bug, as the code below
//does not throw an exception and handles null gracefully on FirstOrDefault
using (var package = new ExcelPackage(new FileInfo(file)))
{
    //however, the question was about a table, so let's also look at that... should be the same?
    //no IDisposable? :(
    //adding a table manually to my worksheet allows the 'same-ish' (child.Parent, aka table.WorkSheet) code to iterate
    var table = package.Workbook.Worksheets.SelectMany(x => x.Tables).FirstOrDefault();
    if (table != null)
    {
        for (int rowIndex = table.Address.Start.Row; rowIndex <= table.Address.End.Row; rowIndex++)
        {
            var row = table.WorkSheet.Row(rowIndex);
            //note: this is the entire worksheet row, not just the table's columns
            var rowCells = table.WorkSheet.Cells[$"{rowIndex}:{rowIndex}"];
            var rowCellsManyText = string.Join(", ", rowCells.Select(x => x.Text));
            for (int columnIndex = table.Address.Start.Column; columnIndex <= table.Address.End.Column; columnIndex++)
            {
                var currentRowColumn = table.WorkSheet.Cells[rowIndex, columnIndex];
                var currentRowColumnText = currentRowColumn.Text;
            }
        }
    }
}
Essentially everything works the same way; you just have to go through child.Parent, i.e. table.WorkSheet, to get at the same stuff. As others have mentioned, extension methods and possibly even wrapper classes could give you more granularity based on the specifics of your business needs, but that was not the purpose of this question.
Regarding the indexing comments and responses, I'd advise sticking with the 'Row' and 'Column' properties (first, last, for, foreach, etc.) instead of hard-coding indexes and worrying about whether they are 0-based or 1-based. I had no issue here, at least with the new version.
Microsoft.Office.Interop.Excel classes. They support reading the Table objects. You just need to know that tables in Excel sheets are called ListObjects in the API. – Taejon