I have a python script that executes multiple sql scripts (one after another) in Redshift. Some of the tables in these sql scripts can be queried multiple times. For ex. Table t1 can be SELECTed in one script and can be dropped/recreated in another script. This whole process is running in one transaction. Now, sometimes, I am getting deadlock detected error and the whole transaction is rolled back. If there is a deadlock on a table, I would like to wait for the table to be released and then retry the sql execution. For other types of errors, I would like to rollback the transaction. From the documentation, it looks like the table lock isn't released until end of transaction. I would like to achieve all or no data changes (which is accomplished by using transaction) but also would like to handle deadlocks. Any suggestion on how this can be accomplished?
handle locks in redshift
Asked Answered
I would execute all of the SQL you are referring to in one transaction with a retry loop. Below is the logic I use to handle concurrency issues and retry (pseudocode for brevity). I do not have the system wait indefinitely for the lock to be released. Instead I handle it in the application by retrying over time.
begin transaction
while not successful and count < 5
try
execute sql
commit
except
if error code is '40P01' or '55P03'
# Deadlock or lock not available
sleep a random time (200 ms to 1 sec) * number of retries
else if error code is '40001' or '25P02'
# "In failed sql transaction" or serialized transaction failure
rollback
sleep a random time (200 ms to 1 sec) * number of retries
begin transaction
else if error message is 'There is no active transaction'
sleep a random time (200 ms to 1 sec) * number of retries
begin transaction
increment count
The key components are catching every type of error, knowing which cases require a rollback, and having an exponential backoff for retries.
If I understand correctly, you are suggesting to commit after every sql script, and rollback just the script that errors out. Is there no way to ensure all sql scripts run under one transaction when there is deadlock error? –
Pickup
I'm suggesting you run all of your SQL inside one transaction (my
execute sql
line). If it deadlocks or fails you retry the whole thing. You may also be interested in savepoints, although I've never used them myself and don't know if it's available in redshift: postgresql.org/docs/current/static/sql-savepoint.html –
Dulcedulcea Ignore my comment about using savepoints. Rollback to savepoint is not available in redshift: docs.aws.amazon.com/redshift/latest/dg/… –
Dulcedulcea
Other things to take into account with Redshift and transactions - you need to ensure your connection has AutoCommit set to false or the usage of the rollback command will be an error in itself and all the stuff you wanted to rolled back will be committed. The AutoCommit feature on a connection ignores transaction structures. –
Shiri
© 2022 - 2024 — McMap. All rights reserved.