Bulk update using Peewee library
Asked Answered
P

2

6

I'm trying to update many records inside a table using Peewee library. Inside a for loop, i fetch a single record and then I update it but this sounds awful in terms of performance so I need to do the update in bulk. Current code look like this:

usernames_to_update = get_target_usernames()
for username in usernames_to_update:
    user = User.get(User.username == username) # username is primary key
    if user.type == 'type_1':
        user.some_attr_1 = some_value_1
    elif user.type == 'type_2':
        user.some_attr_2 = some_value_2
    # elif ....
    user.save()

In the documentation, there is insert_many function but nothing like update_many. Searching around i came up with these solutions:

  1. Executing raw query using CASE: Link
  2. Using replace_many: Link
  3. Using update : Link

But i couldn't find any examples of how to use the second or third solution. Can somebody clarify how cases 2 and 3 can be used?

Pluviometer answered 30/8, 2018 at 11:55 Comment(0)
D
4

You want the .update() method:

query = User.update(validated=True).where(User.username.in_(usernames_to_update))
query.execute()

Edit: so you want to conditionally set the value during an update. You can use the Case helper. Untested:

some_value_1 = 'foo'
some_value_2 = 'bar'
case_stmt = Case(User.type, [
    ('type_1', some_value_1),
    ('type_2', some_value_2)])
query = User.update(some_field=case_stmt).where(User.username.in_(list_of_usernames))
query.execute()

Docs can be found here: http://docs.peewee-orm.com/en/latest/peewee/api.html#Case

Dibbrun answered 30/8, 2018 at 15:17 Comment(4)
I simplified my code but I think it went wrong. I edited the question to reflect the issue, sorry , my fault. How answer changes now?Pluviometer
Thanks for the answer. Any note about replace_many method? Can be it used too? just to knowPluviometer
BTW, it seems that case query can only work on a single column, right? So to update multiple columns i should create separate update queries right? Does these multiple update queries still need optimizations?Pluviometer
You can use ValuesList() to update multiple values, but you probably ought to be comfortable with sql.Dibbrun
I
5

The new best answer is to use the bulk_update() method found here:

with database.atomic():
    User.bulk_update(user_list, fields=['username'], batch_size=50)
Irreformable answered 28/7, 2019 at 15:47 Comment(4)
As the creator of Peewee, you know. Two questions that suggest I am missing something: 1. Under what conditions does bulk_update() outperform update()? 2. The Peewee 3.9.6 documents specify: "In addition, Peewee also offers Model.bulk_update(), which can efficiently update one or more columns on a list of models." What does this specify? This was the source of my original updating of this answer.Irreformable
It is more efficient that iterating over all the items in a loop and updating them singly. The point of bulk_update is to update fields which may have multiple different values, where an ordinary UPDATE query would not suffice. e.g., with Model.update() you specify a list of fields to new values, whereas with .bulk_update() each model can have different values for the columns, and they are resolved using a CASE statement.Dibbrun
bulk_update exists only from 3.0+ versionNorby
If I want to set the same value for all elements, I still need to iterate over my user_list to set all values in the objects to my value, and then use bulk update to perform only one SQL UPDATE query, is that correct ?Bandit
D
4

You want the .update() method:

query = User.update(validated=True).where(User.username.in_(usernames_to_update))
query.execute()

Edit: so you want to conditionally set the value during an update. You can use the Case helper. Untested:

some_value_1 = 'foo'
some_value_2 = 'bar'
case_stmt = Case(User.type, [
    ('type_1', some_value_1),
    ('type_2', some_value_2)])
query = User.update(some_field=case_stmt).where(User.username.in_(list_of_usernames))
query.execute()

Docs can be found here: http://docs.peewee-orm.com/en/latest/peewee/api.html#Case

Dibbrun answered 30/8, 2018 at 15:17 Comment(4)
I simplified my code but I think it went wrong. I edited the question to reflect the issue, sorry , my fault. How answer changes now?Pluviometer
Thanks for the answer. Any note about replace_many method? Can be it used too? just to knowPluviometer
BTW, it seems that case query can only work on a single column, right? So to update multiple columns i should create separate update queries right? Does these multiple update queries still need optimizations?Pluviometer
You can use ValuesList() to update multiple values, but you probably ought to be comfortable with sql.Dibbrun

© 2022 - 2024 — McMap. All rights reserved.