I need a workaround for Excel Guessing Data Types problem
Asked Answered
F

4

7

I'm creating a utility to import data from Excel to Oracle database,

I have a fixed template for the excel file,

Now, when I'm trying to import the data by Jet provider and ADO.Net - Ole connection tools, I found the following problem: there're some columns haven't been imported because there are mixed data types in their columns [string and number],

I looked for this problem on the internet I found the reason is guessing data types from Excel

The load code:

connection = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0; Data Source={0};Extended Properties=Excel 8.0;");
string columns = "P_ID, FULL_NAME_AR, job_no, GENDER, BIRTH_DATE, RELIGION, MARITAL_STATUS, NAT_ID, JOB_Name, FIRST_HIRE_DATE, HIRE_DATE, CONTRACT_TYPE, GRADE_CODE, QUALIFICATION";
string sheetName = "[Emps$]";
OleDbCommand command = new OleDbCommand(string.Format("select {0} from {1} where p_id is not null", columns, sheetName), connection);

connection.Open();
dr = command.ExecuteReader();
DataTable table = new DataTable();
table.Load(dr);

What should I do to tell Excel STOP GUESSING and give me the data as Text ?

if there isn't, can you help me with any workarounds ?

Thanks in advance

Flotation answered 29/8, 2010 at 14:34 Comment(5)
How are you importing the records from excel to oracle? code please? where does it fail?Aphid
Can you not just set the entire column as text in Excel?Hyonhyoscine
@ shahkalpesh: It doesn't throw any exception, it just ignores the text data because it guesses the data type for the column is number from the first 8 records and returns null for the non numeric valuesFlotation
Thanks. Can you post some code that you use to read data from excel?Aphid
The usual workaround for this is to create the empty destination table and import into that, instead of creating the table each time you run the import. That way you have full control over the destination data types.Blowhole
F
8

I found a solution by adding IMEX=1 for the connection string, but there's a special format for it which descriped in the following link.

The IMEX parameter is for columns that use mixed numeric and alpha values. The Excel driver will typically scan the first several rows in order to determine what data type to use for each column. If a column is determined to be numeric based upon a scan of the first several rows, then any rows with alpha characters in this column will be returned as Null. The IMEX parameter (1 is input mode) forces the data type of the column to text so that alphanumeric values are handled properly.

Regards

Flotation answered 30/8, 2010 at 12:18 Comment(0)
C
2

This isn't completely right! Apparently, Jet/ACE ALWAYS assumes a string type if the first 8 rows are blank, regardless of IMEX=1, and always uses a numeric type if the first 8 rows are numbers (again, regardless of IMEX=1). Even when I made the rows read to 0 in the registry, I still had the same problem. This was the only sure fire way to get it to work:

try
{
    Console.Write(wsReader.GetDouble(j).ToString());
}
catch   //Lame unfixable bug
{
    Console.Write(wsReader.GetString(j));
}
Covenantor answered 16/7, 2012 at 19:55 Comment(3)
It all depend on whether you have tampered with the registry or not, TypeGuessRows in particular msdn.microsoft.com/en-us/library/bb177610(v=office.12).aspxHockey
try using wsReader.GetValue(j).ToString(), and then if you want it as a double you can try to parse it from the string.Bethesda
The problem with this, Jim, is that if your first values in the column are numbers but you later have values like "CDF62738", those values will be taken as NULL becouse invalid number :/Sibyl
H
1

Can you work from the excel end? This example run in Excel will put mixed data tyoes into an SQL Server table:

Dim cn As New ADODB.Connection

scn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" _
& sFullName _
& ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";"

cn.Open scn

s = "SELECT Col1, Col2, Col3 INTO [ODBC;Description=TEST;DRIVER=SQL Server;" _
& "SERVER=Some\Instance;Trusted_Connection=Yes;" _
& "DATABASE=test].TableZ  FROM [Sheet1$]"
cn.Execute s
Hockey answered 29/8, 2010 at 14:41 Comment(4)
Thanks Remou, but it doesn't suitable for my case, but my question is, why I can't use IMEX=1 in ADO.Net ? and is there a similar extend property works with ADO.Net?Flotation
Why can you not use IMEX=1? It is just part of the connection string. In this pcreview.co.uk/forums/thread-1863969.php, Paul, MVP, suggest that you can.Hockey
@ Remou: Thank you very much, it works, it was need (char)34 in the connection string (from the link which you added).Flotation
@Flotation Note that the mix must start before TypeGuessRows value in the registry key: Hkey_Local_Machine/Software/Microsoft/Jet/4.0/Engines/Excel/ unless you want to set TypeGuessRows to 0, which means all rows will be scanned.Hockey
V
1

An alternative solution is to add or change the setting TypeGuessRows in the registry. By setting its value to 0, the complete document will be scanned.

Unfortunately, the settings may be found on various locations in the registry, depending on the which libraries and versions of them you have installed.

For instance: [HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Excel] "TypeGuessRows"=dword:00000000

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel] "TypeGuessRows"=dword:00000000

This will also prevent truncation of textual data longer than 255 characters. This happens if you have a number for TypeGuessRows larger than 0 and the first text longer than 255 characters occurs beyond that number.

See also Setting TypeGuessRows for excel ACE Driver.

Vicenta answered 7/12, 2018 at 16:28 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.