ado net - import excel data - missing data
Asked Answered
G

3

2

At first glance I have the same problem as many persons had before and I found many questions and answers about my problem but none of them helped me.

I perform importing from MS excel file (file XLS) in NET by using ADO NET. The file contains mixed types in the same column: numbers and text, and the well known problem occurs - text format is not recognized and the data are lost.

I use the following connection string with recommended parameters:

string strConnectionString = string.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=\"Excel 8.0;HDR=NO;IMEX=1;TypeGuessRows=0;ImportMixedTypes=Text\";", pathname);

This is my code:

OleDbConnection sqlConn = new OleDbConnection(this.strConnectionString);

sqlConn.Open();

OleDbDataAdapter sqlAdapter = new OleDbDataAdapter();

string sql = "SELECT * FROM [" + sheetName + "]";

OleDbCommand selectCMD = new OleDbCommand(sql, sqlConn);

sqlAdapter.SelectCommand = selectCMD;

DataTable dt = new DataTable(sheetName);

sqlAdapter.Fill(dt);

I tested it under NET 3.5 (x86) and NET 4.0 (x86) (also tested as windows exe and asp net version), and the problem remains.

I don't know if I do something wrong but I've spent many hours and problem still remains.

Goeselt answered 15/2, 2014 at 23:46 Comment(2)
Have a look at This. I know its a bit tedious but it will get the job done.Newmint
Thanks @Rohan. I saw this answer and it looks like I have to manually add first text row under Excel and then I can read by ADO NET (and delete the extra added row). I want to avoid changing existing excel file because it is an application for a user not for me. It would be acceptable by me if it was possible to add the first text row temporary by ADO NET, then save excel file, and finally read this changed excel file.Goeselt
G
2

Recently I found open source (The MIT License) NET library, which reads xls and xlsx files correctly. In this case I stop to use oledb drivers.

The library: link to project

Goeselt answered 18/4, 2014 at 20:33 Comment(0)
G
1

You need to add some keys to the Windows Registry, they will force Excel engine to treat al columns as text.

Save the following code in a ForceExcelImportAsText.reg file, and then double click it, to add the keys to the Windows registry.

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Excel]
"TypeGuessRows"=dword:00000000


[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel]
"TypeGuessRows"=dword:00000000

Also, remember to keep Excel CLOSED when you are importing your files.

I never understood why, but if Excel is opened and/or if the file I'm importing is opened, then Excel engine mess up the numeric/data columns, and then all you'll get are meaningless numbers instead of your data.

About the connection string
The connection string I use is

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\YourExcelFile.xls;Extended Properties="Excel 8.0;HDR=YES;IMEX=1"

The parameter "Excel 8.0" is Ok for .xls files up to Excel 2003.
For Excel 2007-2013 .xlsx files I use "Excel 12.0 Xml"
For Excel 2007-2013 .xlsb files I use "Excel 12.0"
For Excel 2007-2013 .xlsm files I use "Excel 12.0 Macro"

Also, the data provider "Microsoft.Jet.OLEDB.4.0" is ok up to .xls Excel 2003 files.
For Excel 2007-2013 xlsx/xlsb/xlsm files you need to use the "Microsoft.ACE.OLEDB.12.0" data provider.

Gallager answered 18/4, 2014 at 13:44 Comment(3)
I use it in web application so changing the registry is impossible because it is hosted on third party machine (server hosting).Goeselt
You should probably include "IMEX=1" in that connection string.Kaylee
@Kaylee Thanks, I fixed it (I swear it was there, on the original connection string I used as a template... but it went lost somewhere in the process :-))Gallager
T
0

If the original goal is to import data and not to use ado.net at any costs then this is the solution that works for us:

  1. If there is Excel installed on the machine doing the import we run small invisible automation script by which Excel converts the original data file into something readable. In our case into the xml. This is how the Xls2Xml.vbs file looks like ' VB Script Document if WScript.Arguments.Count < 2 Then WScript.Echo "Error! Please specify the source path and the destination. Usage: Xls2Xml SourcePath.xls Destination.xml" Wscript.Quit End If Dim oExcel Set oExcel = CreateObject("Excel.Application") Dim oBook Set oBook = oExcel.Workbooks.Open(Wscript.Arguments.Item(0)) oBook.SaveAs WScript.Arguments.Item(1), 46 oBook.Close False oExcel.Quit Wscript.Quit The 46 magic number means xlXMLSpreadsheet. When the script finishes we then import the readable XML format (usually after processing it through a customer-specific XSLT script)

  2. If there will be no Excel automation server going to be available then we use XLSX as the data format and the cool open source ClosedXML library to read the data. Actually the library is better at creating XLSX files but it is not required in this question

Microsoft itself does not recommend access to legacy Excel proprietary format using legacy database drivers, mainly because of those many issues (e.g. the cell data format ambiguity). And any legacy solutions that rely on Excel automation are not guaranteed to work in the on-server scenarios.

In order to use the OpenXML SDK in production you just need to deploy single DocumentFormat.OpenXml.dll and you're ready to read/write XLSX data files

Tauto answered 18/4, 2014 at 18:51 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.