Bulk upsert with Django
Asked Answered
I

3

10

bulk_create with ignore_conflicts=True insert new records only and doesn't update existing one. bulk_update updates only existing records and doesn't insert new one.

Now I see only one variant to make upsert is a raw query:

from catalog.models import Product

with connection.cursor() as cursor:
    cursor.executemany(
        """
        INSERT INTO app_table (pk, col1, col2) 
        VALUES (%s, %s, %s)
        ON DUPLICATE KEY UPDATE 
            col1 = VALUES(col1), 
            col2 = VALUES(col2); 
        """,
        [
            ('1', 'val1', 'val2'),
            ('2', 'val1', 'val2'),
        ]
    )

Is there another way to perform bulk upsert in Django?

Intercurrent answered 21/11, 2019 at 18:7 Comment(2)
Surely you run bulk_create followed by bulk_update?Roccoroch
@Roccoroch Anyway bulk performs more than one query if amount of data more then bulk_size, so it's a good decisionIntercurrent
P
1

There isn't a traditional method to use create and update in a single query in MySQL. I see that you've used INSERT ... ON DUPLICATE KEY UPDATE which is a common approach in these cases. An alternative lengthy approach:

  • Update a portion of your data in a temporary table, essentially COPY to your temp table -> cursor.copy_from(), then UPDATE to your original table.

Also, you could optimize performance by increasing your cache size, and setting DEBUG=False if you don't need metadeta in your logs Ref.

Perhaps also test the same data using PostgreSQL, and check if it results in better performance.

Pushcart answered 21/11, 2019 at 19:19 Comment(0)
G
5

As of Django 4.1, the bulk_create method supports upserts via update_conflicts:

MyModel.objects.bulk_create(
    queryset, 
    update_conflicts=True, 
    update_fields=['col1', 'col2'],
    unique_fields=['col3'],
)
Gwin answered 25/10, 2022 at 6:8 Comment(0)
P
1

There isn't a traditional method to use create and update in a single query in MySQL. I see that you've used INSERT ... ON DUPLICATE KEY UPDATE which is a common approach in these cases. An alternative lengthy approach:

  • Update a portion of your data in a temporary table, essentially COPY to your temp table -> cursor.copy_from(), then UPDATE to your original table.

Also, you could optimize performance by increasing your cache size, and setting DEBUG=False if you don't need metadeta in your logs Ref.

Perhaps also test the same data using PostgreSQL, and check if it results in better performance.

Pushcart answered 21/11, 2019 at 19:19 Comment(0)
D
0

Mysql have REPLACE INTO function, very useful for replacing data and avoid constraints violation https://dev.mysql.com/doc/refman/8.0/en/replace.html

Domingodominguez answered 9/5, 2024 at 21:46 Comment(1)
As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.Elah

© 2022 - 2025 — McMap. All rights reserved.