Convert scientific notation to float when using OpenRowSet to import a .CSV file
Asked Answered
A

3

9

I am using openrowset to import a csv file into SQL Server. One of the columns in the csv file contains numbers in scientific notation (1.08E+05) and the column in the table it is being inserted

By default it is importing the value as 1 and ignoring the .08E+05.

I have tried using cast() and convert() to convert the value directly when the query is executed as well as setting up the datatype in the table as a character string and importing it as such. All of these methods have the same behavior where the .08E+05 is ignored.

Is there a way to have the value imported as 108000 instead of 1 without the .08E+05 without having to change the csv file itself?

Setting up the datatype as a varchar and reading in the csv file appears to have the same effect with the following code:

CREATE TABLE #dataTemp (StartDate datetime, Value varchar(12))

SET @insertDataQuery = 'SELECT Date, CSVValue from OpenRowset(''MSDASQL'', ''Driver={Microsoft Text Driver (*.txt; *.csv)}; DefaultDir=' 
SET @insertDataQuery = @insertDataQuery + 'C:\Data\;'',''SELECT * FROM '+ '11091800.csv' + ''')'

INSERT INTO #dataTemp EXEC(@insertDataQuery)

SELECT * FROM #dataTemp

Not all of the values in the CSV file have the scientific notation and the value without it, e.g. 81000 come across without issue.

Alcala answered 19/9, 2011 at 15:9 Comment(0)
K
16

For BULK INSERT methodologies I've often found it simpler to first move the data into a table of all varchars, then get rid of extraneous things like quoted delimiters and fix formatting. I remember having a heck of a time getting rid of the scientific notation, you can just play with the varchar table until you get it right. I remember attempting all kinds of precision/scale combinations until I finally found one that was compatible. I think for me it was FLOAT then DECIMAL(24,12)...

SELECT CONVERT(DECIMAL(24, 12), CONVERT(FLOAT, '1.08E+05'));

EDIT adding what I did to try to repro and/or demonstrate a less convoluted way.

I created a very simple CSV file:

StartDate,Value
20110808,81000
20110808,1.08E+05

Then I ran the following code (for some reason I can't get MSDASQL to run on my machine to save my life):

CREATE TABLE #dataTemp(StartDate DATETIME, Value VARCHAR(32));

BULK INSERT #dataTemp FROM 'C:\data\whatever.csv' 
    WITH (ROWTERMINATOR='\n', FIELDTERMINATOR=',', FIRSTROW = 2);

SELECT * FROM #dataTemp
GO
SELECT StartDate, CONVERT(INT, CONVERT(FLOAT, Value)) FROM #dataTemp;
GO
DROP TABLE #dataTemp;

Results:

StartDate               Value
----------------------- --------
2011-08-08 00:00:00.000 81000
2011-08-08 00:00:00.000 1.08E+05

StartDate               (No column name)
----------------------- ----------------
2011-08-08 00:00:00.000 81000
2011-08-08 00:00:00.000 108000
Keiko answered 19/9, 2011 at 15:17 Comment(6)
Actually in your case now that I'm re-reading the question you can probably just use SELECT CONVERT(INT, CONVERT(FLOAT, '1.08E+05')); - the last time I worked on scientific notation I was dealing with performance counters from LogMan, and I definitely needed the decimal places...Keiko
Any of the above solutions work if I supply the value explicitly. In the CSV file I have a column name for the value and if I use that, for example CONVERT(INT, CONVERT(FLOAT, COLUMN_NAME_IN_CSV)) I still get the behavior where it only reads the first digit. Thoughts?Alcala
My thoughts are still to bulk insert into a table of varchars first. There may be something else going on when your insert ... select from openrowset query is trying to match table types to metadata from the query. If you're inserting into varchar first, it's not going to care...Keiko
That makes sense however setting the datatype up as a varchar gives me the same behavior. I added an example of what I have in the query above.Alcala
Please try the method in my updated answer. If you are still getting conversion errors you're going to have to try to narrow down what values are causing the issue - you may have data in that column that can't be converted the way you might expect.Keiko
Using the bulk insert works, I'll adjust the query to use that instead as it is much easier than converting all of the files or changing the way they are generated. Thanks for the help.Alcala
D
5

Will casting it as a real work?

select cast('1.08E+05' as real)
Dramamine answered 19/9, 2011 at 15:17 Comment(0)
P
5

First of all, the fact you have a scientific notation means its likely Excel or some other program that created the value has LOST some data....in other words, the original number inside the notation was converted and so some numbers and accuracy was lost. thats a problem with many Microsoft products that convert from Excel and CSV.

Second, here is a better converting piefce that converts the number to a string:

CONVERT(nvarchar(255),LTRIM(RTRIM(str(ISNULL(YOUR_NUMBER,0),20,0))))
Privy answered 15/12, 2011 at 23:19 Comment(1)
.....has LOST some data......thats a problem with many Microsoft products that convert from Excel and CSV..... exactly. Plus one. Not having Excel installed is pretty much a prerequisite for any kind of data work. Problem is getting your customers to uninstall Excel is quite difficult :)Dockage

© 2022 - 2024 — McMap. All rights reserved.