How to Import Multiple CSV Files into a MySQL Database

Is there a way to import multiple CSV files at the same time into a MySQL database? Some sort of batch import?

I'm on Mac OS X running a MAMP server.

I have 185 CSV files that I need to import into a MySQL table. I can import them individually using phpMyAdmin's Import tab, but it would take a long time. Does anyone know if there is a better way?

Beekeeping answered 16/12, 2011 at 19:11 Comment(0)
16

Use a shell script like this:

#!/usr/bin/env bash
cd yourdirectory
for f in *.csv
do
        mysql -e "USE yourDatabase; LOAD DATA LOCAL INFILE '$f' INTO TABLE yourtable"
done
Woodbridge answered 16/12, 2011 at 19:25 Comment(3)
Hi nazar554, Thank you for your suggestion. Unfortunately, I'm getting the "-bash: mysql: command not found" error message. I'm not sure what I'm doing wrong.Beekeeping
Also, you're invoking mysql once for every file, and that's quite wasteful.Glasshouse
How would you do this recursively (with subdirectories)?Iasis
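
For the recursive case asked about above, a minimal sketch using find (the directory, database, and table names are placeholders):

#!/usr/bin/env bash
# find walks subdirectories too; -print0 / read -d '' keeps filenames with spaces intact
find /yourdirectory -name '*.csv' -print0 | while IFS= read -r -d '' f
do
    mysql --local-infile -e "LOAD DATA LOCAL INFILE '$f' INTO TABLE yourtable" yourDatabase
done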
12

There's a little PHP script for you:

#!/usr/bin/php
<?php
mysql_connect('localhost','root','root'); // MAMP defaults
mysql_select_db('yourdatabase');
$files = glob('*.csv');
foreach($files as $file){
    mysql_query("LOAD DATA INFILE '".$file."' INTO TABLE yourtable");
}

See the MySQL manual for the LOAD DATA INFILE options that fit your files.

Atronna answered 16/12, 2011 at 19:19 Comment(1)
Can this be used to allow the first row of data to be used as the column names?Trounce
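
Not directly: LOAD DATA won't take column names from the file itself, but you can skip the header row with IGNORE 1 LINES and list the target columns yourself. A minimal sketch, with made-up table and column names:

mysql --local-infile -e "LOAD DATA LOCAL INFILE 'file.csv' INTO TABLE yourtable FIELDS TERMINATED BY ',' IGNORE 1 LINES (email, lastname, firstname)" yourdatabase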
6

You could use a shell script to loop through the files (this one assumes they're in the current directory):

#!/bin/bash

for f in *.csv
do
    mysql -e "load data infile '$f' into table my_table" -u username --password=your_password my_database
done
Pleura answered 16/12, 2011 at 19:29 Comment(2)
Hi Tom H, Thank you for the message. I don't know anything about using bash. I'm getting this error in Terminal: "-bash: : command not found". Any idea as to what I'm doing wrong? Thanks.Beekeeping
Replace the shebang with the one from nazar554's script (#!/usr/bin/env bash). After that you'll get the same error you're getting from his (mysql: command not found), since we both happened to give you the same answer. You need to run this script on the same machine where your database is hosted (or use the -h switch to connect remotely), and it must have the mysql command-line client installed.Pleura
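
For instance, the remote case looks like this (hostname and credentials are placeholders):

mysql -h db.example.com -u username -p my_database -e "SELECT 1"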
4

I've modified Tom's script to solve a few issues I faced:

#!/bin/bash

for f in *.csv
do
    mysql -e "load data local infile '$f' into table myTable fields TERMINATED BY ',' LINES TERMINATED BY '\n'" -u myUser --password=myPassword fmeter --local-infile
done
  1. load data local infile instead of load data infile: the file is read by the client and sent to the server, instead of being read directly by the MySQL server.
  2. Added delimiter switches to match my data.
  3. --local-infile to enable local data load mode on the client (see the note below).
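
A hedged note on point 3: many MySQL servers also ship with local_infile disabled on the server side, so LOCAL loads can fail until a privileged account turns it on:

mysql -u root -p -e "SET GLOBAL local_infile = 1;"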
Mariannmarianna answered 10/8, 2013 at 16:14 Comment(1)
This just throws an error: Syntax error: word unexpected (expecting "do")Masurium
3

For Windows users, use this batch file:

@echo off
setlocal enabledelayedexpansion
FOR %%f IN ("*.csv") DO (
  set old=%%~dpnxf
  set new=!old:\=\\!
  mysql -e "load data local infile '!new!' IGNORE into table email_us.business COLUMNS TERMINATED BY ','" -u root
  echo %%~nxf DONE
)
  • email_us -> the database
  • business -> the table
  • IGNORE -> ignore duplicate inserts and keep going on error
  • ~dpnxf -> d for drive letter, p for path to file, n for filename, x for extension, and f is the file variable

Steps: put the batch file in the directory where all the CSV files live, name it something.bat, run cmd.exe as administrator, call that something.bat file, and enjoy importing...

Terminator answered 11/4, 2015 at 10:53 Comment(0)
1

In Python you can use d6tstack, which makes this simple:

import d6tstack
import glob

c = d6tstack.combine_csv.CombinerCSV(glob.glob('*.csv'))
c.to_mysql_combine('mysql+mysqlconnector://usr:pwd@localhost/db', 'tablename')

It also deals with data schema changes, creates the table, and lets you preprocess the data.

Neisse answered 14/10, 2018 at 23:21 Comment(1)
I found that d6tstack worked well for combining CSVs, but to_mysql_combine failed because I have an enormous CSV import. to_mysql_combine also required me to use pandas to_sql with chunksize set at 200. This approach also supported other MySQL connectors, like PyMySQL. Will create another answer with details. c.to_sql(name='inequity', con=engine, if_exists='append', index=False, chunksize=200)Blackthorn
1

I had the same task with a lot of CSV files, creating one table per CSV, so here is the script I use locally under XAMPP.

<?php
ini_set('display_errors',1);
echo '### Begin Importation<br>';

$mysqli  =  new mysqli(
"localhost",
"root",
"",
"mydatabase",
3306
);

if (mysqli_connect_errno()) {
    printf("Connect failed: %s\n", mysqli_connect_error());
    exit();
}

$files = glob('C:\\xampp\\mysql\\data\\mev2\\*.csv');

foreach($files as $file){

    //clean names if needed
    $filename = explode('\\',$file);
    $filename2clean = str_replace('.csv','', $filename[5]);//because my file is under 5 folders on my PC
    $n = strtolower(str_replace('fileprefix_','', $filename2clean));

    echo '<br>Create table <b>'.$n.'</b><hr>';

    $sql = "CREATE TABLE IF NOT EXISTS `mydatabase`.`".$n."` (`email` varchar(60), `lastname` varchar(60), `firstname` varchar(60), `country` varchar(19)) DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;";

    if (!($stmt = $mysqli->query($sql))) {
        echo "\nQuery execute failed: ERRNO: (" . $mysqli->errno . ") " . $mysqli->error;
    };

    echo '<br>Import data from <b>'.$n.'</b><hr>';

    $sql = "LOAD DATA INFILE '".basename($file)."' INTO TABLE `mydatabase`.`".$n."`  
        FIELDS TERMINATED BY ';'
        LINES TERMINATED BY '\r'  
        IGNORE 1 LINES";

    if (!($stmt = $mysqli->query($sql))) {
        echo "\nQuery execute failed: ERRNO: (" . $mysqli->errno . ") " . $mysqli->error;
    };

}

echo '### Import finished !<br>';
Matthewmatthews answered 27/3, 2019 at 9:43 Comment(1)
great script! Thanks for sharingAxle
0

@hlosukwakha You want to use mysqlimport. It looks for a table named like the file. Use mysqlimport --help to find the correct parameters, but they're basically identical to mysql's.
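A minimal sketch, assuming the files are named after their target tables (yourtable.csv loads into the table yourtable) and are comma-delimited with a header row:

mysqlimport --local --fields-terminated-by=',' --ignore-lines=1 \
    -u username -p yourDatabase /path/to/*.csv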

Jansenism answered 8/3, 2015 at 16:12 Comment(0)
0

Use the following shell script:

for file in /directory/*.csv
do
echo "Importing file $file"
       chown mysql $file
        mysql Fortinet -u user -p'password' <<EOF

LOAD DATA LOCAL INFILE '$file'
IGNORE
INTO TABLE tablename
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;
EOF

echo "Completed importing '"$file"' "

done
Woodsman answered 30/5, 2019 at 13:40 Comment(0)
0

I used Python and d6tstack like @citynorman, but because I had 24 million lines in 200 CSV files, that approach was killing my development database server.

This approach gives you a lot of control and performance in 2 or 3 lines of code. It pulled the 24 million rows into a five-index-column MySQL table, with data clean-ups added, in around 2 minutes. The CSV import tool in MySQL Workbench was taking days to do the same thing.

Here's what I made work:

import pandas as pd
import d6tstack.combine_csv as d6tc
import glob
import pymysql  # This approach also supports other MySQL connectors
from sqlalchemy import create_engine
engine = create_engine("mysql+pymysql://usr:pass@host:3306/db")

# For testing just pull in one or two csv files - and then take all
# My data had a ; semicolon separator, so change this to your use case if needed
df = d6tc.CombinerCSV(glob.glob('C:/Users/user/Downloads/csvfiles/*.csv'), sep=';').to_pandas()

# Remove Filepath and Filename 
df.drop(columns=["filepath","filename"],inplace=True, axis=1)

# I created Indexes in my database file during testing, so this line
# makes sure there are no null index values in the CSVs
df = df[df['country'].notna()]

# chunksize throttles your database updates so as not to overwhelm any buffers
# NEVER use "if_exists=replace", unless you want to blank your table 100%
df.to_sql(name='table', con=engine, if_exists='append', index=False, chunksize=200)  
Blackthorn answered 6/3, 2021 at 10:11 Comment(0)
