Trying to do LOAD DATA INFILE with REPLACE and AUTO_INCREMENT

I am trying to load a file into a MySQL table whose primary key is auto-incremented, and I would like existing rows to be updated when the file contains duplicates. However, the REPLACE keyword only acts on a primary key (or unique index) collision, and my primary key is auto-generated, so I'm stuck.

How can I have a table with an ID that auto-increments and still insert/update data from a file using LOAD DATA INFILE?

Here is the table

CREATE TABLE  `oxygen_domain`.`TEST` (
`TEST_ID` int(11) NOT NULL AUTO_INCREMENT,
`NAME` varchar(255) NOT NULL,
`VALUE` varchar(255) DEFAULT NULL,
PRIMARY KEY (`TEST_ID`,`NAME`,`VALUE`)
) 

and here is the command

LOAD DATA LOCAL INFILE 'C:/testData.txt'
REPLACE
INTO TABLE TEST
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(NAME, VALUE);

and here is the sample data

ignored name, ignored value
name1,value1
name2,value2
name3,value3

The desired end result after running the command above multiple times with the above data is

|TEST_ID |NAME  |VALUE  |
|1       |name1 |value1 |
|2       |name2 |value2 |
|3       |name3 |value3 |
Postmillennialism answered 29/5, 2012 at 14:35 Comment(0)

OBSERVATION #1

You should not do REPLACE because it is a mechanical DELETE and INSERT.

As the MySQL Documentation says about REPLACE

Paragraph 2

REPLACE is a MySQL extension to the SQL standard. It either inserts, or deletes and inserts. For another MySQL extension to standard SQL—that either inserts or updates—see Section 13.2.5.3, “INSERT ... ON DUPLICATE KEY UPDATE Syntax”.

Paragraph 5

To use REPLACE, you must have both the INSERT and DELETE privileges for the table.

Using REPLACE will throw away established values for TEST_ID that cannot automatically be reused.
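
To make the difference concrete, here is a minimal sketch on a hypothetical scratch table. The name `REPLACE_DEMO` and its unique key on NAME are assumptions made for illustration only; they are not part of the question. Running it should show the row getting a new TEST_ID after REPLACE, while INSERT ... ON DUPLICATE KEY UPDATE leaves the TEST_ID of the existing row alone.

```sql
-- Hypothetical scratch table, used only for illustration.
CREATE TABLE `REPLACE_DEMO` (
  `TEST_ID` int NOT NULL AUTO_INCREMENT,
  `NAME` varchar(255) NOT NULL,
  `VALUE` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`TEST_ID`),
  UNIQUE KEY (`NAME`)
);

INSERT INTO `REPLACE_DEMO` (`NAME`, `VALUE`) VALUES ('name1', 'value1');
SELECT `TEST_ID`, `NAME`, `VALUE` FROM `REPLACE_DEMO`;  -- note the TEST_ID

-- REPLACE deletes the conflicting row and inserts a new one,
-- so the row comes back with a freshly generated TEST_ID.
REPLACE INTO `REPLACE_DEMO` (`NAME`, `VALUE`) VALUES ('name1', 'value2');
SELECT `TEST_ID`, `NAME`, `VALUE` FROM `REPLACE_DEMO`;

-- ON DUPLICATE KEY UPDATE changes the existing row in place,
-- so TEST_ID stays what it was after the previous statement.
INSERT INTO `REPLACE_DEMO` (`NAME`, `VALUE`) VALUES ('name1', 'value3')
ON DUPLICATE KEY UPDATE `VALUE` = 'value3';
SELECT `TEST_ID`, `NAME`, `VALUE` FROM `REPLACE_DEMO`;

DROP TABLE `REPLACE_DEMO`;
```

Note that with InnoDB the auto-increment counter may still advance on the update path, so rows inserted later can show gaps in TEST_ID even though existing rows keep theirs.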

OBSERVATION #2

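The table layout cannot trap duplicate keys. Because the primary key (TEST_ID, NAME, VALUE) includes the auto-generated TEST_ID, every loaded row gets a new, unique key, so REPLACE never finds a duplicate to replace.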

If a name is unique, the table should be laid out like this

LAYOUT #1

CREATE TABLE  `oxygen_domain`.`TEST` (
`TEST_ID` int(11) NOT NULL AUTO_INCREMENT,
`NAME` varchar(255) NOT NULL,
`VALUE` varchar(255) DEFAULT NULL,
PRIMARY KEY (`TEST_ID`),
UNIQUE KEY (`NAME`)
) 

If a name allows multiple values, the table should be laid out like this

LAYOUT #2

CREATE TABLE  `oxygen_domain`.`TEST` (
`TEST_ID` int(11) NOT NULL AUTO_INCREMENT,
`NAME` varchar(255) NOT NULL,
`VALUE` varchar(255) DEFAULT NULL,
PRIMARY KEY (`TEST_ID`),
UNIQUE KEY (`NAME`,`VALUE`)
) 

PROPOSED SOLUTION

Use a temp table to catch everything. Then, perform a big INSERT from the temp table based on layout

LAYOUT #1

Replace the VALUE for a Duplicate NAME

USE oxygen_domain;
DROP TABLE IF EXISTS `TESTLOAD`;

CREATE TABLE `TESTLOAD` SELECT NAME,VALUE FROM TEST WHERE 1=2;

LOAD DATA LOCAL INFILE 'C:/testData.txt'
INTO TABLE `TESTLOAD`
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(NAME, VALUE);

INSERT INTO `TEST` (NAME, VALUE)
SELECT NAME, VALUE FROM `TESTLOAD`
ON DUPLICATE KEY UPDATE VALUE = VALUES(VALUE);

DROP TABLE `TESTLOAD`;
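
On MySQL 8.0.20 and later, using VALUES() inside ON DUPLICATE KEY UPDATE is deprecated and produces a warning. A possible alternative for the INSERT statement above, sketched here under the assumption that TEST has a unique index on NAME and that TESTLOAD has not been dropped yet, is to refer to the staging table's column directly:

```sql
-- Alternative to VALUES(VALUE): take the new value from the SELECT's source table.
-- Assumes a unique index on TEST.NAME so that a duplicate NAME triggers the update.
INSERT INTO `TEST` (`NAME`, `VALUE`)
SELECT `NAME`, `VALUE` FROM `TESTLOAD`
ON DUPLICATE KEY UPDATE `VALUE` = `TESTLOAD`.`VALUE`;
```

On older servers the VALUES(VALUE) form shown earlier remains the usual spelling.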

LAYOUT #2

Ignore Duplicate (NAME,VALUE) rows

USE oxygen_domain;
DROP TABLE IF EXISTS `TESTLOAD`;

CREATE TABLE `TESTLOAD` SELECT NAME,VALUE FROM TEST WHERE 1=2;

LOAD DATA LOCAL INFILE 'C:/testData.txt'
INTO TABLE `TESTLOAD`
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(NAME, VALUE);

INSERT IGNORE INTO `TEST` (NAME, VALUE)
SELECT NAME, VALUE FROM `TESTLOAD`;

DROP TABLE `TESTLOAD`;

Update

If we need to avoid creating and dropping the table each time, we can TRUNCATE the staging table before or after the INSERT ... SELECT statement. That way the table does not have to be created again for the next load.
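
A rough sketch of that variant for LAYOUT #1 follows. It assumes the staging table is kept permanently under the same name `TESTLOAD` and that its columns are declared by hand to match TEST; adjust both to your setup.

```sql
USE oxygen_domain;

-- One-time setup: a permanent staging table (column types copied from TEST).
CREATE TABLE IF NOT EXISTS `TESTLOAD` (
  `NAME` varchar(255) NOT NULL,
  `VALUE` varchar(255) DEFAULT NULL
);

-- Per load: empty the staging table, load the file, then merge into TEST.
TRUNCATE TABLE `TESTLOAD`;

LOAD DATA LOCAL INFILE 'C:/testData.txt'
INTO TABLE `TESTLOAD`
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(NAME, VALUE);

INSERT INTO `TEST` (NAME, VALUE)
SELECT NAME, VALUE FROM `TESTLOAD`
ON DUPLICATE KEY UPDATE VALUE = VALUES(VALUE);
```

Keep in mind that TRUNCATE TABLE requires the DROP privilege on the table, so this does not reduce the privileges the loading account needs.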

Pelagianism answered 10/9, 2014 at 17:29 Comment(2)
Yeah I found an answer about 10 minutes after my bounty.. Thanks for a better answer. - Wakerife
Worked very well - Watthour

Create unique index on NAME & VALUE and use IGNORE instead of REPLACE:

LOAD DATA LOCAL INFILE 'C:/testData.txt'
IGNORE
INTO TABLE `TEST`
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(NAME, VALUE);
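
The unique index itself could be created along these lines before running the load. The index name `UQ_NAME_VALUE` is made up for this sketch, and the statement will fail if TEST already contains duplicate (NAME, VALUE) pairs, so those would need cleaning up first.

```sql
-- Hypothetical index name; run once, before the LOAD DATA ... IGNORE above.
ALTER TABLE `TEST`
  ADD UNIQUE KEY `UQ_NAME_VALUE` (`NAME`, `VALUE`);
```

Two caveats: rows whose VALUE is NULL are not treated as duplicates of each other by a unique index, and on older MySQL versions or row formats an index over two varchar(255) columns may exceed the maximum key length, depending on the character set.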
Lamdin answered 17/9, 2014 at 0:14 Comment(0)
