Import Multiple CSV Files to SQL Server from a Folder
Asked Answered
C

8

25

I have a folder called "Dump." This folder consists of various .CSV Files. The folder Location is 'C:\Dump'

I want to Import the contents of these files into SQL Server. I want the rough code along with proper comments so that I understand it.

I have tried a few codes that I found on the Net. But they haven't quite worked out for me for some strange reason.


The steps I would like to have are

Step 1: Copy all the File Names in the folder to a Table

Step 2: Iterate through the table and copy the data from the files using Bulk Insert.


Someone do please help me out on this one. Thanks a lot in advance :)

Colloquy answered 18/4, 2013 at 6:50 Comment(0)
C
57
    --BULK INSERT MULTIPLE FILES From a Folder 

    --a table to loop thru filenames drop table ALLFILENAMES
    CREATE TABLE ALLFILENAMES(WHICHPATH VARCHAR(255),WHICHFILE varchar(255))

    --some variables
    declare @filename varchar(255),
            @path     varchar(255),
            @sql      varchar(8000),
            @cmd      varchar(1000)


    --get the list of files to process:
    SET @path = 'C:\Dump\'
    SET @cmd = 'dir ' + @path + '*.csv /b'
    INSERT INTO  ALLFILENAMES(WHICHFILE)
    EXEC Master..xp_cmdShell @cmd
    UPDATE ALLFILENAMES SET WHICHPATH = @path where WHICHPATH is null


    --cursor loop
    declare c1 cursor for SELECT WHICHPATH,WHICHFILE FROM ALLFILENAMES where WHICHFILE like '%.csv%'
    open c1
    fetch next from c1 into @path,@filename
    While @@fetch_status <> -1
      begin
      --bulk insert won't take a variable name, so make a sql and execute it instead:
       set @sql = 'BULK INSERT Temp FROM ''' + @path + @filename + ''' '
           + '     WITH ( 
                   FIELDTERMINATOR = '','', 
                   ROWTERMINATOR = ''\n'', 
                   FIRSTROW = 2 
                ) '
    print @sql
    exec (@sql)

      fetch next from c1 into @path,@filename
      end
    close c1
    deallocate c1


    --Extras

    --delete from ALLFILENAMES where WHICHFILE is NULL
    --select * from ALLFILENAMES
    --drop table ALLFILENAMES
Colloquy answered 18/4, 2013 at 8:56 Comment(5)
How did you get around the Invalid object name 'Temp' error when trying to do the Bulk Insert?Avoid
'Temp' has to be the name of the table you're inserting to, it has to exist!Leannaleanne
@Mvision maybe you can help me with this question :) #32863289Systematize
Running this, my ALLFILENAMES table only has 2 records. 'Access is denied.' and NULL. Any idea what may be causing this or how to fix it?Cartogram
Does Temp have to have the same columns as the csv files...and all csv files have to have same table structure? Otherwise how do we pre-create temp?Surat
K
5

This will give you separate tables for each file.

--BULK INSERT MULTIPLE FILES From a Folder 
drop table allfilenames
--a table to loop thru filenames drop table ALLFILENAMES
CREATE TABLE ALLFILENAMES(WHICHPATH VARCHAR(255),WHICHFILE varchar(255))

--some variables
declare @filename varchar(255),
        @path     varchar(255),
        @sql      varchar(8000),
        @cmd      varchar(1000)


--get the list of files to process:
SET @path = 'D:\Benihana\backup_csv_benihana_20191128032207_part_1\'
SET @cmd = 'dir ' + @path + '*.csv /b'
INSERT INTO  ALLFILENAMES(WHICHFILE)
EXEC Master..xp_cmdShell @cmd
UPDATE ALLFILENAMES SET WHICHPATH = @path where WHICHPATH is null

delete from ALLFILENAMES where  WHICHFILE is null
--SELECT replace(whichfile,'.csv',''),* FROM dbo.ALLFILENAMES


--cursor loop
declare c1 cursor for SELECT WHICHPATH,WHICHFILE FROM ALLFILENAMES where WHICHFILE like '%.csv%' order by WHICHFILE desc
open c1
fetch next from c1 into @path,@filename
While @@fetch_status <> -1
  begin
  --bulk insert won't take a variable name, so make a sql and execute it instead:
   set @sql = 

   'select * into '+ Replace(@filename, '.csv','')+'
    from openrowset(''MSDASQL''
    ,''Driver={Microsoft Access Text Driver (*.txt, *.csv)}''
    ,''select * from '+@Path+@filename+''')' 


print @sql
exec (@sql)

  fetch next from c1 into @path,@filename
  end
close c1
deallocate c1
Kirkwood answered 13/5, 2020 at 17:57 Comment(0)
N
1

For Step 1 Maybe you can look at:

http://www.sql-server-performance.com/forum/threads/copying-filenames-to-sql-table.11546/

or

How to list files inside a folder with SQL Server

and then Step 2

How to cast variables in T-SQL for bulk insert?

HTH

Nabob answered 18/4, 2013 at 7:49 Comment(2)
Thanks for the help. but i eventually got a code working for it. :)Colloquy
@Colloquy how did you get this working? Can you provide code?Irresponsive
S
1

You might need to enable the xp_cmdshell first:

sp_configure 'show advanced options', '1'
RECONFIGURE
go
sp_configure 'xp_cmdshell', '1' 
RECONFIGURE
go

And, to enable ad_hoc,

sp_configure 'show advanced options', 1;
RECONFIGURE;
GO
sp_configure 'Ad Hoc Distributed Queries', 1;
RECONFIGURE;
GO
Spile answered 3/11, 2020 at 18:25 Comment(0)
H
0

To solve step 1, xp_dirtree can also be used to list all files and folders.

Keep in mind that it is an undocumented function. Security precautions must be considered. Intentionally crafted filenames could be an intrusion vector.

Himmler answered 11/12, 2018 at 15:19 Comment(0)
W
0

In python you can use d6tstack which makes this simple

import d6tstack
import glob

c = d6tstack.combine_csv.CombinerCSV(glob.glob('*.csv'))
c.to_mssql_combine('mssql+pymssql://usr:pwd@localhost/db', 'tablename')

See SQL examples. It also deals with data schema changes, creates table and allows you to preprocess data. It leverages BULK INSERT so should be just as fast.

Woorali answered 17/12, 2018 at 4:18 Comment(0)
G
0

to expand upon the answer by SarangArd you can replace temp with the following if your file name matches your table name.

' + Left(@filename, Len(@filename)-4) + '
Grangerize answered 22/8, 2019 at 18:57 Comment(0)
S
0

This code will create a new table per CSV file that is imported. Best to populate empty database from CSV files.

CREATE TABLE ALLFILENAMES
(
             WHICHPATH VARCHAR(255)
            ,WHICHFILE VARCHAR(255)
)

DECLARE @filename VARCHAR(255),
        @path     VARCHAR(255),
        @sql      VARCHAR(8000),
        @cmd      VARCHAR(1000)

SET @path = 'L:\DATA\SOURCE\CSV\'  --PATH TO YOUR CSV FILES (CHANGE TO YOUR PATH)
SET @cmd = 'dir ' + @path + '*.csv /b'

INSERT INTO ALLFILENAMES(WHICHFILE)
EXEC Master..xp_cmdShell @cmd

UPDATE ALLFILENAMES
  SET WHICHPATH = @path
WHERE WHICHPATH IS NULL

DECLARE c1 CURSOR
FOR SELECT WHICHPATH
          ,WHICHFILE
    FROM ALLFILENAMES
    WHERE WHICHFILE LIKE '%.csv%'

OPEN c1

FETCH NEXT FROM c1 INTO @path,
                        @filename

WHILE @@fetch_status <> -1
BEGIN
    CREATE TABLE #Header
    (
                 HeadString NVARCHAR(MAX)
    )
    DECLARE @Columns NVARCHAR(MAX) = ''
    DECLARE @Query NVARCHAR(MAX) = ''
    DECLARE @QUERY2 NVARCHAR(MAX) = ''
    DECLARE @HeaderQuery NVARCHAR(MAX) = ''

    SELECT @HeaderQuery = @HeaderQuery + 'bulk insert #Header from ''' + @path + @filename + '''  
        with(firstrow=1,lastrow=1)'
    EXEC (@HeaderQuery)

    SELECT @Columns = (SELECT QUOTENAME(value) + ' nvarchar(max)' + ','
                       FROM #Header
                            CROSS APPLY STRING_SPLIT(HeadString,',') FOR xml PATH(''))
    IF ISNULL(@Columns,'') <> ''
    BEGIN
        SET @Columns = LEFT(@Columns,LEN(@Columns) - 1)

        SELECT @Query = @Query + 'CREATE TABLE ' + Replace(@filename,'.csv','') + ' (' + replace(@Columns,'"','') + ')'
        PRINT @Query
        EXEC (@QUERY)
    END

    SELECT @QUERY2 = @QUERY2 + 'bulk insert ' + replace(Replace(@filename,'.csv',''),'.TPS','') + ' from ''' + @path + @filename + '''  
        with(firstrow=2,FORMAT=''csv'',FIELDTERMINATOR='','',ROWTERMINATOR=''\n'')'
    EXEC (@QUERY2)

    DROP TABLE #Header
    FETCH NEXT FROM c1 INTO @path,
                            @filename
END

CLOSE c1

DEALLOCATE c1
Spile answered 3/11, 2020 at 19:35 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.