How can I do 'insert if not exists' in MySQL?

Asked 1/9, 2009 at 8:56 Answered 13/2 at 7:12

Solved sql mysql sql-insert primary-key unique-constraint

1090

I started by googling and found the article How to write INSERT if NOT EXISTS queries in standard SQL which talks about mutex tables.

I have a table with ~14 million records. If I want to add more data in the same format, is there a way to ensure the record I want to insert does not already exist without using a pair of queries (i.e., one query to check and one to insert is the result set is empty)?

Does a unique constraint on a field guarantee the insert will fail if it's already there?

It seems that with merely a constraint, when I issue the insert via PHP, the script croaks.

Stadiometer answered 1/9, 2009 at 8:56 Comment(0)

1013

Use INSERT IGNORE INTO table.

There's also INSERT … ON DUPLICATE KEY UPDATE syntax, and you can find explanations in 13.2.6.2 INSERT ... ON DUPLICATE KEY UPDATE Statement.

Post from bogdan.org.ua according to Google's webcache:

18th October 2007

To start: as of the latest MySQL, syntax presented in the title is not possible. But there are several very easy ways to accomplish what is expected using existing functionality.

There are 3 possible solutions: using INSERT IGNORE, REPLACE, or INSERT … ON DUPLICATE KEY UPDATE.

Imagine we have a table:
CREATE TABLE `transcripts` (
`ensembl_transcript_id` varchar(20) NOT NULL,
`transcript_chrom_start` int(10) unsigned NOT NULL,
`transcript_chrom_end` int(10) unsigned NOT NULL,
PRIMARY KEY (`ensembl_transcript_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Now imagine that we have an automatic pipeline importing transcripts meta-data from Ensembl, and that due to various reasons the pipeline might be broken at any step of execution. Thus, we need to ensure two things:

repeated executions of the pipeline will not destroy our > database

repeated executions will not die due to ‘duplicate > primary key’ errors.

Method 1: using REPLACE

It’s very simple:
REPLACE INTO `transcripts`
SET `ensembl_transcript_id` = 'ENSORGT00000000001',
`transcript_chrom_start` = 12345,
`transcript_chrom_end` = 12678;
If the record exists, it will be overwritten; if it does not yet exist, it will be created. However, using this method isn’t efficient for our case: we do not need to overwrite existing records, it’s fine just to skip them.

Method 2: using INSERT IGNORE Also very simple:
INSERT IGNORE INTO `transcripts`
SET `ensembl_transcript_id` = 'ENSORGT00000000001',
`transcript_chrom_start` = 12345,
`transcript_chrom_end` = 12678;
Here, if the ‘ensembl_transcript_id’ is already present in the database, it will be silently skipped (ignored). (To be more precise, here’s a quote from MySQL reference manual: “If you use the IGNORE keyword, errors that occur while executing the INSERT statement are treated as warnings instead. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted.”.) If the record doesn’t yet exist, it will be created.

This second method has several potential weaknesses, including non-abortion of the query in case any other problem occurs (see the manual). Thus it should be used if previously tested without the IGNORE keyword.

Method 3: using INSERT … ON DUPLICATE KEY UPDATE:

Third option is to use INSERT … ON DUPLICATE KEY UPDATE syntax, and in the UPDATE part just do nothing do some meaningless (empty) operation, like calculating 0+0 (Geoffray suggests doing the id=id assignment for the MySQL optimization engine to ignore this operation). Advantage of this method is that it only ignores duplicate key events, and still aborts on other errors.

As a final notice: this post was inspired by Xaprb. I’d also advise to consult his other post on writing flexible SQL queries.

Aretha answered 1/9, 2009 at 9:2 Comment(19)

and can I combine that with "delayed" to speed the script up? – Stadiometer 1/9, 2009 at 9:18

yes, insert delayed might speed up things for you. try it out – Aretha 1/9, 2009 at 9:27

update - I did an insert ignore delay and the php script finishes much faster by offloading all of the work onto mysql; for my current purposes, this is perfect; thanks – Stadiometer 1/9, 2009 at 14:43

Yes, and keep in mind that REPLACE INTO does DELETE then INSERT, not UPDATE – Behl 30/1, 2011 at 21:30

So, which one is best -- INSERT IGNORE INTO table or INSERT … ON DUPLICATE KEY UPDATE ? – Hinayana 21/4, 2012 at 23:20

@Jimmy: that depends entirely depends on what you want to achieve, they do different things. – Aretha 22/4, 2012 at 7:51

INSERT … ON DUPLICATE KEY UPDATE is better since it does not delete the row, preserving any auto_increment columns and other data. – Zolner 11/12, 2012 at 18:2

Where would you put in the WHERE statement if you want a specific condition to affect both INSERT and UPDATE? – Hero 25/3, 2013 at 20:17

@user1166877: What WHERE condition? There is no where condition in an INSERT statement – Aretha 26/3, 2013 at 15:15

Just to inform everyone. Using INSERT … ON DUPLICATE KEY UPDATE method does increment any AUTO_INCREMENT column with failed insert. Probably because it's not really failed, but UPDATE'd. – Lati 29/10, 2013 at 21:36

@user1147688 the same happens with INSERT IGNORE, this only applies to the InnoDB engine – Joinder 18/12, 2013 at 15:35

Getting odd incremental behaviour using INSERT ON DUPLICATE KEY on InnoDB engine. The auto-increment column increments even on failed inserts. So if you had Keyword1,Keyword2 already stored and then tried to add a new one, Keyword3, Keyword3's ID would actually be incremented by 3, because of the previous duplicate finds/fails. – Darrendarrey 8/8, 2014 at 12:41

Awesome answer, thank you! I will be using the "on duplicate key update" option. – Cantilena 3/1, 2017 at 4:51

All of that works only if the table has a unique index. There may be reasons to insert rows with the same index, and then this query is useless, it still creates new rows. – Hackamore 3/3, 2019 at 15:22

How does this advice hold up today? Is "on duplicate key update" still sound advice? – Arty 16/8, 2019 at 0:45

The INSERT IGNORE INTO tableName works, if the colum you going to affect had a Unique key. That way its now going to insert it into the dbase and going to warn you about it, but will insert the rest. – Velum 19/8, 2020 at 13:33

is there a way to get the id of existing row after calling INSERT ... ON DUPLICATE KEY UPDATE ? I was using $db->lastInsertId() when using auto_increment, but that's no longer valid if I use this instead, right? – Frisbie 3/2, 2021 at 22:13

@Frisbie I'm not aware of an option to combine the two. – Aretha 4/2, 2021 at 19:50

INSERT IGNORE works but note it ignores ALL errors: https://mcmap.net/q/54151/-on-duplicate-key-ignore-duplicate – Liggitt 8/6, 2021 at 13:9

331

Solution:

INSERT INTO `table` (`value1`, `value2`) 
SELECT 'stuff for value1', 'stuff for value2' FROM DUAL 
WHERE NOT EXISTS (SELECT * FROM `table` 
      WHERE `value1`='stuff for value1' AND `value2`='stuff for value2' LIMIT 1)

Explanation:

The innermost query

SELECT * FROM `table` 
      WHERE `value1`='stuff for value1' AND `value2`='stuff for value2' LIMIT 1

used as the WHERE NOT EXISTS-condition detects if there already exists a row with the data to be inserted. After one row of this kind is found, the query may stop, hence the LIMIT 1 (micro-optimization, may be omitted).

The intermediate query

SELECT 'stuff for value1', 'stuff for value2' FROM DUAL

represents the values to be inserted. DUAL refers to a special one row, one column table present by default in all Oracle databases (see https://en.wikipedia.org/wiki/DUAL_table). On a MySQL-Server version 5.7.26 I got a valid query when omitting FROM DUAL, but older versions (like 5.5.60) seem to require the FROM information. By using WHERE NOT EXISTS the intermediate query returns an empty result set if the innermost query found matching data.

The outer query

INSERT INTO `table` (`value1`, `value2`)

inserts the data, if any is returned by the intermediate query.

Pothook answered 11/6, 2010 at 18:38 Comment(11)

can you give some more info on how to use this? – Totemism 15/5, 2012 at 19:2

This variant is suitable if no unique key on table exists (INSERT IGNORE and INSERT ON DUPLICATE KEY require unique key constraints) – Magnifico 4/4, 2013 at 18:29

What if stuff for value1 and stuff for value2 are identical? This would throw a Duplicate column name – Pommard 29/6, 2014 at 8:13

@Pommard They aren't column names, they are strings; the columns are value1 and value2. This data for different columns of course contain dupes. – Crocodilian 5/4, 2016 at 8:46

I also much prefer SELECT 1 instead of SELECT * in the subqueries. Much more likely that this can be satisfied by an index. – Crocodilian 5/4, 2016 at 8:49

@Pommard use "AS v1" to solve "duplicate column name", see #3165005 – Improvvisatore 9/2, 2017 at 10:45

Instead of DUAL you can use (at least in mysql) INSERT INTO table (value1, value2) SELECT 'stuff for value1', 'stuff for value2' FROM (select 1) x WHERE NOT EXISTS (SELECT * FROM table WHERE value1='stuff for value1' AND value2='stuff for value2'); – Zosi 21/3, 2018 at 9:55

Thanks, that's the best solution in this topic, and it works ;-) – Vyatka 2/4, 2019 at 10:45

"FROM" is not required for a SELECT statement, you don't need it if you aren't using any columns from that table "FROM DUAL" is pointless – Gellman 15/5, 2019 at 20:44

Thank you for the explanation and alternative solution to needing to update where a non unique key doesn't exist. Learned something new today! – Hymenopteran 19/7, 2022 at 16:12

Where are the VALUES to be inserted? – Sverre 1/3 at 12:33

In MySQL, ON DUPLICATE KEY UPDATE or INSERT IGNORE can be viable solutions.

An example of ON DUPLICATE KEY UPDATE update based on mysql.com:

INSERT INTO table (a,b,c) VALUES (1,2,3)
  ON DUPLICATE KEY UPDATE c=c+1;

UPDATE table SET c=c+1 WHERE a=1;

An example of INSERT IGNORE based on mysql.com

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [(col_name,...)]
    {VALUES | VALUE} ({expr | DEFAULT},...),(...),...
    [ ON DUPLICATE KEY UPDATE
      col_name=expr
        [, col_name=expr] ... ]

Or:

INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name
    SET col_name={expr | DEFAULT}, ...
    [ ON DUPLICATE KEY UPDATE
      col_name=expr
        [, col_name=expr] ... ]

Or:

INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name [(col_name,...)]
    SELECT ...
    [ ON DUPLICATE KEY UPDATE
      col_name=expr
        [, col_name=expr] ... ]

Ordonnance answered 1/9, 2009 at 9:5 Comment(0)

Any simple constraint should do the job, if an exception is acceptable. Examples:

primary key if not surrogate
unique constraint on a column
multi-column unique constraint

Sorry if this seems deceptively simple. I know it looks bad confronted to the link you share with us. ;-(

But I nevertheless give this answer, because it seems to fill your need. (If not, it may trigger you updating your requirements, which would be "a Good Thing"(TM) also).

If an insert would break the database unique constraint, an exception is throw at the database level, relayed by the driver. It will certainly stop your script, with a failure. It must be possible in PHP to address that case...

Heinz answered 1/9, 2009 at 9:1 Comment(4)

i added a clarification to the question - does your answer still apply? – Stadiometer 1/9, 2009 at 9:12

I believe it does. A unique constraint will cause the failure of incorrect inserts. Note : you have to deal with this failure in your code, but this is quite standard. – Heinz 1/9, 2009 at 16:10

for now I'm going to stick with the solution I accepted - but will further look into handling INSERT failures etc as the app grows – Stadiometer 7/9, 2009 at 6:47

INSERT IGNORE basically changes all errors into warnings so that your script is not interrupted. You can then view any warnings with the command SHOW WARNINGS. And another important note: UNIQUE constraints don't work with NULL values, ie. row1 (1, NULL) and row2 (1, NULL) will both get inserted (unless another constraint such as a primary key is broken). Unfortunate. – Overtax 24/8, 2011 at 3:51

Try the following:

IF (SELECT COUNT(*) FROM beta WHERE name = 'John' > 0)
  UPDATE alfa SET c1=(SELECT id FROM beta WHERE name = 'John')
ELSE
BEGIN
  INSERT INTO beta (name) VALUES ('John')
  INSERT INTO alfa (c1) VALUES (LAST_INSERT_ID())
END

Udale answered 10/5, 2013 at 9:58 Comment(2)

Try This answers are low-value on StackOverflow because they do very little to educate the OP and thousands of future researchers. Please edit this answer to include how the solution works and why it is a good idea. – Edvard 1/9, 2018 at 2:58

Perfect solution in case the to-be-matching fields aren't keys ..! – Flagman 3/2, 2020 at 22:19

Here is a PHP function that will insert a row only if all the specified columns values don't already exist in the table.

If one of the columns differ, the row will be added.
If the table is empty, the row will be added.

If a row exists where all the specified columns have the specified values, the row won't be added.

 function insert_unique($table, $vars)
 {
   if (count($vars)) {
     $table = mysql_real_escape_string($table);
     $vars = array_map('mysql_real_escape_string', $vars);

     $req = "INSERT INTO `$table` (`". join('`, `', array_keys($vars)) ."`) ";
     $req .= "SELECT '". join("', '", $vars) ."' FROM DUAL ";
     $req .= "WHERE NOT EXISTS (SELECT 1 FROM `$table` WHERE ";

     foreach ($vars AS $col => $val)
       $req .= "`$col`='$val' AND ";

     $req = substr($req, 0, -5) . ") LIMIT 1";

     $res = mysql_query($req) OR die();
     return mysql_insert_id();
   }
   return False;
 }

Example usage:

<?php
  insert_unique('mytable', array(
    'mycolumn1' => 'myvalue1',
    'mycolumn2' => 'myvalue2',
    'mycolumn3' => 'myvalue3'
    )
  );
?>

Affine answered 6/3, 2012 at 19:56 Comment(4)

Pretty expensive if you have a huge load of insertions. – Commendam 11/3, 2012 at 2:41

true, but efficient if you need to add specific checkups – Ciapas 11/4, 2012 at 19:44

Warning: mysql_* extension is deprecated as of PHP 5.5.0, and has been removed as of PHP 7.0.0. Instead, either the mysqli or PDO_MySQL extension should be used. See also the MySQL API Overview for further help while choosing a MySQL API. – Gratulant 20/3, 2020 at 18:0

What's the point in doing mysql_real_escape_string($table)? – Helsa 10/5, 2022 at 10:9

REPLACE INTO `transcripts`
SET `ensembl_transcript_id` = 'ENSORGT00000000001',
`transcript_chrom_start` = 12345,
`transcript_chrom_end` = 12678;

If the record exists, it will be overwritten; if it does not yet exist, it will be created.

Jalousie answered 6/7, 2012 at 14:35 Comment(3)

REPLACE may delete the row and then insert instead of update. The side effect is that constraints may delete other objects and delete triggers are fired. – Daniel 28/2, 2017 at 21:4

From the MySQL manual: "REPLACE makes sense only if a table has a PRIMARY KEY or UNIQUE index. Otherwise, it becomes equivalent to INSERT, because there is no index to be used to determine whether a new row duplicates another." – Fracture 31/12, 2019 at 9:28

REPLACE INTO is, perhaps, the worst possible answer to this question – Stadiometer 12/9, 2023 at 17:27

There are several answers that cover how to solve this if you have a UNIQUE index that you can check against with ON DUPLICATE KEY or INSERT IGNORE. That is not always the case, and as UNIQUE has a length constraint (1000 bytes) you might not be able to change that. For example, I had to work with metadata in WordPress (wp_postmeta).

I finally solved it with two queries:

UPDATE wp_postmeta SET meta_value = ? WHERE meta_key = ? AND post_id = ?;
INSERT INTO wp_postmeta (post_id, meta_key, meta_value) SELECT DISTINCT ?, ?, ? FROM wp_postmeta WHERE NOT EXISTS(SELECT * FROM wp_postmeta WHERE meta_key = ? AND post_id = ?);

Query 1 is a regular UPDATE query without any effect when the data set in question is not there. Query 2 is an INSERT which depends on a NOT EXISTS, i.e. the INSERT is only executed when the data set doesn't exist.

Sandry answered 18/8, 2016 at 18:38 Comment(1)

This example works is a bit wrong. It will insert the duplicate rows as many there exists in the table meta_key. If adding LIMIT 1 at the end - it will work, but still this feels kid of unsafe/hacky. I prefer the select part after insert write like this: SELECT * FROM (SELECT DISTINCT ?, ?, ?) as tmp - it feels safer in case if forgot to write LIMIT 1. – Rectory 9/2, 2022 at 15:42

Something worth noting is that INSERT IGNORE will still increment the primary key whether the statement was a success or not just like a normal INSERT would.

This will cause gaps in your primary keys that might make a programmer mentally unstable. Or if your application is poorly designed and depends on perfect incremental primary keys, it might become a headache.

Look into innodb_autoinc_lock_mode = 0 (server setting, and comes with a slight performance hit), or use a SELECT first to make sure your query will not fail (which also comes with a performance hit and extra code).

Analyze answered 7/9, 2019 at 20:53 Comment(2)

Why would "gaps in your primary keys" - even potentially - "make a programmer mentally unstable"? Gaps occur all the time in primary keys - every time you delete a record, for example. – Stadiometer 9/9, 2019 at 12:51

Starting with a SELECT defeats the whole purpose of just handing-off a big batch of INSERTs and not wanting to worry about duplicates. – Stadiometer 9/9, 2019 at 12:51

Update or insert without known primary key

If you already have a unique or primary key, the other answers with either INSERT INTO ... ON DUPLICATE KEY UPDATE ... or REPLACE INTO ... should work fine (note that replace into deletes if exists and then inserts - thus does not partially update existing values).

But if you have the values for some_column_id and some_type, the combination of which are known to be unique. And you want to update some_value if exists, or insert if not exists. And you want to do it in just one query (to avoid using a transaction). This might be a solution:

INSERT INTO my_table (id, some_column_id, some_type, some_value)
SELECT t.id, t.some_column_id, t.some_type, t.some_value
FROM (
    SELECT id, some_column_id, some_type, some_value
    FROM my_table
    WHERE some_column_id = ? AND some_type = ?
    UNION ALL
    SELECT s.id, s.some_column_id, s.some_type, s.some_value
    FROM (SELECT NULL AS id, ? AS some_column_id, ? AS some_type, ? AS some_value) AS s
) AS t
LIMIT 1
ON DUPLICATE KEY UPDATE
some_value = ?

Basically, the query executes this way (less complicated than it may look):

Select an existing row via the WHERE clause match.
Union that result with a potential new row (table s), where the column values are explicitly given (s.id is NULL, so it will generate a new auto-increment identifier).
If an existing row is found, then the potential new row from table s is discarded (due to LIMIT 1 on table t), and it will always trigger an ON DUPLICATE KEY which will UPDATE the some_value column.
If an existing row is not found, then the potential new row is inserted (as given by table s).

Note: Every table in a relational database should have at least a primary auto-increment id column. If you don't have this, add it, even when you don't need it at first sight. It is definitely needed for this "trick".

Contaminate answered 6/12, 2019 at 8:26 Comment(2)

Several other answerers have proffered an INSERT INTO ... SELECT FROM format. Why did you also? – Stadiometer 9/12, 2019 at 23:12

@Stadiometer Either you did not read my answer, you do not understand it, or I didn't explain it properly. In any case, let me emphasize the following: this is not just a regular INSERT INTO... SELECT FROM... solution. Please refer to me a link to an answer that is the same, if you can find it I will delete this answer, otherwise you upvote my answer (deal?). Be sure to verify that the answer you're going to link only uses 1 query (for update+insert), no transaction, and is able to target any combination of columns that are known to be unique (so separately the columns don't need to be unique). – Contaminate 11/12, 2019 at 8:51

MySQL does not support the MERGE statement otherwise that would have been the most versatile solution. The INSERT IGNORE hack requires unique key constraints which is not always an option.

A simple solution is to use table value constructor along with not exists. This allows you to insert multiple rows with customized duplicate checking. And there is no repetition in the where clause.

The following is the syntax for MySQL, but the logic works in any RDBMS that supports table value constructors:

insert into tbl (k, a, b)
select k, a, b
from (values
    row(3, 'a3', 'b3'),
    row(4, 'a4', 'b4'),
    row(5, 'a5', 'b5')
) as tvc(k, a, b)
where not exists (
    select 1
    from tbl
    where tbl.k = tvc.k
)

DB<>Fiddle for MySQL, MariaDB, SQL Server

Limon answered 13/2 at 7:12 Comment(6)

how is this better/different from what has already been answered hereon? – Stadiometer 14/2 at 19:31

I didn't see any example of inserting multiple rows or one using tvc. – Limon 14/2 at 19:50

@Stadiometer are you asking for the answer to give explicit, compelling insights regarding why this answer should be used over pre-existing advice? – Edvard 17/2 at 6:38

@Edvard - with a new answer to such a prominent question, yes: that's precisely what I am asking. Why is this answer better-than/adding-to what has already been answered – Stadiometer 19/2 at 17:23

@SalmanA ... the accepted answer covers multiple rows quite handily - https://mcmap.net/q/53067/-how-can-i-do-39-insert-if-not-exists-39-in-mysql :) – Stadiometer 19/2 at 17:24

The accepted answer requires unique keys, not ideal when you have specific or one time need. I'll expand my answer in a while. – Limon 20/2 at 3:40

-2

INSERT INTO table_name (columns) VALUES (values) ON CONFLICT (id) DO NOTHING;

Stateless answered 21/2, 2022 at 17:15 Comment(4)

How does this answer differ from the accepted answer, and all the previously-supplied and -upvoted responses? – Stadiometer 22/2, 2022 at 14:20

@Stadiometer The biggest difference is that this answer won't work in MySQL. It would work in PostgreSQL and potentially other database software that support ON CONFLICT (albeit not in SQLite, which would IGNORE rather than DO NOTHING). – Vallievalliere 1/7, 2022 at 5:25

This works in postgresql, IGNORE did not. – Gaol 21/1, 2023 at 3:27

@Vallievalliere - if it "won't work in MySQL", then it;s doubly-bad in a question about MySQL! – Stadiometer 12/9, 2023 at 12:16

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Update or insert without known primary key

Recommended topics

Hot tags