Bulk update MySql with python
Asked Answered
C

2

7

I have to update millions of row into MySQL. I am currently using for loop to execute query. To make the update faster I want to use executemany() of Python MySQL Connector, so that I can update in batches using single query for each batch.

Crystallography answered 3/5, 2016 at 13:53 Comment(3)
Nice .. but I see no question ?Trichromat
Basically I want to use executemany for update in python. Is it possible ?Crystallography
What version of MySQL Connector are you using?Darra
G
9

I don't think mysqldb has a way of handling multiple UPDATE queries at one time.

But you can use an INSERT query with ON DUPLICATE KEY UPDATE condition at the end.

I written the following example for ease of use and readability.

import MySQLdb

def update_many(data_list=None, mysql_table=None):
    """
    Updates a mysql table with the data provided. If the key is not unique, the
    data will be inserted into the table.

    The dictionaries must have all the same keys due to how the query is built.

    Param:
        data_list (List):
            A list of dictionaries where the keys are the mysql table
            column names, and the values are the update values
        mysql_table (String):
            The mysql table to be updated.
    """

    # Connection and Cursor
    conn = MySQLdb.connect('localhost', 'jeff', 'atwood', 'stackoverflow')
    cur = conn.cursor()

    query = ""
    values = []

    for data_dict in data_list:

        if not query:
            columns = ', '.join('`{0}`'.format(k) for k in data_dict)
            duplicates = ', '.join('{0}=VALUES({0})'.format(k) for k in data_dict)
            place_holders = ', '.join('%s'.format(k) for k in data_dict)
            query = "INSERT INTO {0} ({1}) VALUES ({2})".format(mysql_table, columns, place_holders)
            query = "{0} ON DUPLICATE KEY UPDATE {1}".format(query, duplicates)

        v = data_dict.values()
        values.append(v)

    try:
        cur.executemany(query, values)
    except MySQLdb.Error, e:
        try:
            print"MySQL Error [%d]: %s" % (e.args[0], e.args[1])
        except IndexError:
            print "MySQL Error: %s" % str(e)

        conn.rollback()
        return False

    conn.commit()
    cur.close()
    conn.close()

Explanation of one liners

columns = ', '.join('`{}`'.format(k) for k in data_dict)

is the same as

column_list = []
for k in data_dict:
    column_list.append(k)
columns = ", ".join(columns)

Here's an example of usage

test_data_list = []
test_data_list.append( {'id' : 1, 'name' : 'Marco', 'articles' : 1 } )
test_data_list.append( {'id' : 2, 'name' : 'Keshaw', 'articles' : 8 } )
test_data_list.append( {'id' : 3, 'name' : 'Wes', 'articles' : 0 } )

update_many(data_list=test_data_list, mysql_table='writers')

Query output

INSERT INTO writers (`articles`, `id`, `name`) VALUES (%s, %s, %s) ON DUPLICATE KEY UPDATE articles=VALUES(articles), id=VALUES(id), name=VALUES(name)

Values output

[[1, 1, 'Marco'], [8, 2, 'Keshaw'], [0, 3, 'Wes']]
Gaye answered 17/7, 2017 at 20:2 Comment(1)
This was useful to me. I found that with Python 3.8, I needed to change values.append(v) to list(values.append(v)). I also found that I needed to update all of the fields (columns) in the record. I don't know if that was because of constraints in the database. However, I got a cryptic message that said something about programmer error. So, I went ahead and pulled all the elements in the record and put them all out when I did the INSERT. I used mysql.connector instead of MySQLdb. I was able to do update all of a small table (50,000) records in less than a minute. Now on to bigger tables.Rhomb
H
2

Maybe this can help How to update multiple rows with single MySQL query in python?

cur.executemany("UPDATE Writers SET Name = %s WHERE Id = %s ",
    [("new_value" , "3"),("new_value" , "6")])
conn.commit()
Hamann answered 3/5, 2016 at 15:50 Comment(4)
No Its not working. It works for only insert not for updateCrystallography
It's working for me with mysql-connector-python-rf==2.1.3. However, executemany for UPDATEs works similar as update one by one in a loop. (It's optimized for INSERTs only).Darra
Yes, while this works for me, it seems extremely slow.Monophonic
The answer by @Westly above shows how to do the UPDATES as INSERTS and helped me with my many UPDATES (I did 5000) at a time.Rhomb

© 2022 - 2024 — McMap. All rights reserved.