SQLAlchemy: cascade delete
Asked Answered
P

10

188

I must be missing something trivial with SQLAlchemy's cascade options because I cannot get a simple cascade delete to operate correctly -- if a parent element is a deleted, the children persist, with null foreign keys.

I've put a concise test case here:

from sqlalchemy import Column, Integer, ForeignKey
from sqlalchemy.orm import relationship

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key = True)

class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key = True)
    parentid = Column(Integer, ForeignKey(Parent.id))
    parent = relationship(Parent, cascade = "all,delete", backref = "children")

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

session = Session()

parent = Parent()
parent.children.append(Child())
parent.children.append(Child())
parent.children.append(Child())

session.add(parent)
session.commit()

print "Before delete, children = {0}".format(session.query(Child).count())
print "Before delete, parent = {0}".format(session.query(Parent).count())

session.delete(parent)
session.commit()

print "After delete, children = {0}".format(session.query(Child).count())
print "After delete parent = {0}".format(session.query(Parent).count())

session.close()

Output:

Before delete, children = 3
Before delete, parent = 1
After delete, children = 3
After delete parent = 0

There is a simple, one-to-many relationship between Parent and Child. The script creates a parent, adds 3 children, then commits. Next, it deletes the parent, but the children persist. Why? How do I make the children cascade delete?

Parsonage answered 17/2, 2011 at 19:22 Comment(2)
This section in the docs (at least now, 3 years later after the original post) seems quite helpful on this: docs.sqlalchemy.org/en/rel_0_9/orm/session.html#cascadesUlterior
I think the session.commit() after session.delete(parent) is unnecessary.Hixon
G
279

The problem is that sqlalchemy considers Child as the parent, because that is where you defined your relationship (it doesn't care that you called it "Child" of course).

If you define the relationship on the Parent class instead, it will work:

children = relationship("Child", cascade="all,delete", backref="parent")

(note "Child" as a string: this is allowed when using the declarative style, so that you are able to refer to a class that is not yet defined)

You might want to add delete-orphan as well (delete causes children to be deleted when the parent gets deleted, delete-orphan also deletes any children that were "removed" from the parent, even if the parent is not deleted)

EDIT: just found out: if you really want to define the relationship on the Child class, you can do so, but you will have to define the cascade on the backref (by creating the backref explicitly), like this:

parent = relationship(Parent, backref=backref("children", cascade="all,delete"))

(implying from sqlalchemy.orm import backref)

Gallstone answered 17/2, 2011 at 20:15 Comment(5)
This is well explained in the current doc docs.sqlalchemy.org/en/rel_0_9/orm/cascades.htmlSites
@Gallstone - can you please explain what "removed from parent" means?Epicene
@Lyman Zerga: in the OP's example: if you remove a Child object from parent.children, should that object be deleted from the database, or should only it's reference to the parent be removed (ie. set parentid column to null, instead of deleting the row)Gallstone
Wait, the relationship doesn't dictate the parent-child setup. Using ForeignKey on a table is what sets it up as the child. It doesn't matter if the relationship is on the parent or child.Holmann
just a note, for cascade in relation, you need to add it to the parent table, NOT the child table.Yhvh
H
193

Pretty old post, but I just spent an hour or two on this, so I wanted to share my finding, especially since some of the other comments listed aren't quite right.

TL;DR

Give the child table a foreign or modify the existing one, adding ondelete='CASCADE':

parent_id = db.Column(db.Integer, db.ForeignKey('parent.id', ondelete='CASCADE'))

And one of the following relationships:

a) This on the parent table:

children = db.relationship('Child', backref='parent', passive_deletes=True)

b) Or this on the child table:

parent = db.relationship('Parent', backref=backref('children', passive_deletes=True))

Details

First off, despite what the accepted answer says, the parent/child relationship is not established by using relationship, it's established by using ForeignKey. You can put the relationship on either the parent or child tables and it will work fine. Although, apparently on the child tables, you have to use the backref function in addition to the keyword argument.

Option 1 (preferred)

Second, SqlAlchemy supports two different kinds of cascading. The first, and the one I recommend, is built into your database and usually takes the form of a constraint on the foreign key declaration. In PostgreSQL it looks like this:

CONSTRAINT child_parent_id_fkey FOREIGN KEY (parent_id)
REFERENCES parent_table(id) MATCH SIMPLE
ON DELETE CASCADE

This means that when you delete a record from parent_table, then all the corresponding rows in child_table will be deleted for you by the database. It's fast and reliable and probably your best bet. You set this up in SqlAlchemy through ForeignKey like this (part of the child table definition):

parent_id = db.Column(db.Integer, db.ForeignKey('parent.id', ondelete='CASCADE'))
parent = db.relationship('Parent', backref=backref('children', passive_deletes=True))

The ondelete='CASCADE' is the part that creates the ON DELETE CASCADE on the table.

Gotcha!

There's an important caveat here. Notice how I have a relationship specified with passive_deletes=True? If you don't have that, the entire thing will not work. This is because by default when you delete a parent record SqlAlchemy does something really weird. It sets the foreign keys of all child rows to NULL. So if you delete a row from parent_table where id = 5, then it will basically execute

UPDATE child_table SET parent_id = NULL WHERE parent_id = 5

Why you would want this I have no idea. I'd be surprised if many database engines even allowed you to set a valid foreign key to NULL, creating an orphan. Seems like a bad idea, but maybe there's a use case. Anyway, if you let SqlAlchemy do this, you will prevent the database from being able to clean up the children using the ON DELETE CASCADE that you set up. This is because it relies on those foreign keys to know which child rows to delete. Once SqlAlchemy has set them all to NULL, the database can't delete them. Setting the passive_deletes=True prevents SqlAlchemy from NULLing out the foreign keys.

You can read more about passive deletes in the SqlAlchemy docs.

Option 2

The other way you can do it is to let SqlAlchemy do it for you. This is set up using the cascade argument of the relationship. If you have the relationship defined on the parent table, it looks like this:

children = relationship('Child', cascade='all,delete', backref='parent')

If the relationship is on the child, you do it like this:

parent = relationship('Parent', backref=backref('children', cascade='all,delete'))

Again, this is the child so you have to call a method called backref and putting the cascade data in there.

With this in place, when you delete a parent row, SqlAlchemy will actually run delete statements for you to clean up the child rows. This will likely not be as efficient as letting this database handle if for you so I don't recommend it.

Here are the SqlAlchemy docs on the cascading features it supports.

Holmann answered 4/8, 2016 at 14:20 Comment(9)
Why does declaring a Column in the child table as ForeignKey('parent.id', ondelete='cascade', onupdate='cascade') not work, either? I expected the children to be deleted when their parent table row got deleted too. Instead, SQLA either sets the children to a parent.id=NULL or leaves them "as is", but no deletes. That's after originally defining the relationship in the parent as children = relationship('Parent', backref='parent') or relationship('Parent', backref=backref('parent', passive_deletes=True)); DB shows cascade rules in the DDL (SQLite3-based proof-of-concept). Thoughts?Amund
Also, I should note that when I use backref=backref('parent', passive_deletes=True) I get the following warning: SAWarning: On Parent.children, 'passive_deletes' is normally configured on one-to-many, one-to-one, many-to-many relationships only. "relationships only." % self, suggesting it doesn't like the use of passive_deletes=True in this (obvious) one-to-many parent-child relationship for some reason.Amund
Great explanation. One question - is delete redundant in cascade='all,delete' ?Constriction
@Constriction delete IS redundant in cascade='all,delete', since according to the SQLAlchemy's docs, all is a synonym for: save-update, merge, refresh-expire, expunge, deleteBucko
@Bucko I don't think it is redundant since it is heavily used in the docs: docs.sqlalchemy.org/en/13/orm/….Synchronic
Thanks for the Gotcha part. Just what pawned me. This answer probably deserves to be accepted.Synchronic
I got the same condition as setting relationship('child',backref='parent', cascade="all, delete", passive_deletes=True) and Column(..., ForeignKey('parent.id', ondelete="CASCADE")) and still don't work, when I try to delete the parent row the child row FK are set to NULL. I add nullable=False in the child FK column as Column(..., ForeignKey('parent.id', ondelete="CASCADE"), nullable=False) and it works, not sure if this is the reason why cannot delete the child row.Preglacial
Also read about passive_deletes="all" behaviourBleareyed
@d512, please add cascade="all, delete" to the relationship to handle the case where some children are loaded. The current docs are quite clear about that. Without that I still get the children's parent_id set to null.Wahl
E
173

@Steven's asnwer is good when you are deleting through session.delete() which never happens in my case. I noticed that most of the time I delete through session.query().filter().delete() (which doesn't put elements in the memory and deletes directly from db). Using this method sqlalchemy's cascade='all, delete' doesn't work. There is a solution though: ON DELETE CASCADE through db (note: not all databases support it).

class Child(Base):
    __tablename__ = "children"

    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parents.id", ondelete='CASCADE'))

class Parent(Base):
    __tablename__ = "parents"

    id = Column(Integer, primary_key=True)
    child = relationship(Child, backref="parent", passive_deletes=True)
Elmaleh answered 9/10, 2012 at 13:52 Comment(13)
Thanks for explaining this difference - I was trying to use session.query().filter().delete() and struggling to find the issueOidea
I had to set passive_deletes='all' in order to get the children to be deleted by the database cascade when the parent is deleted. With passive_deletes=True, children objects were getting disassociated (parent set to NULL) before the parent is deleted, so the database cascade wasn't doing anything.Ayn
@MiloradPop-Tosic I haven't used SQLAlchemy for over 3 years but reading the doc looks like passive_deletes=True is still the right thing.Elmaleh
I can confirm that passive_deletes=True does work correctly in this scenario.Holmann
I was having trouble with alembic auto-generating revisions that included cascade on delete - this was the answer.Augmenter
@MiloradPop-Tosic Your comment was the only thing that fixed this for me, I wish it was an answer that I could upvoteIndusium
To impart more clarity to this answer (and some follow-up comments), the syntax session.query(Parent).filter().delete() will always work if you set ondelete='cascade' on the child's foreign key declaration (i.e. the cascade rules are specified directly at the database level). This works whether or not passive_delete has been declared on the relationship (i.e. at the orm level) because the query is an actual sql DELETE. The object is not first loaded to then be deleted later with session.delete(parent), so the orm is never involved.Clemens
On the other hand, if you delete a parent using session.delete(parent) (meaning first loading an object then deleting it with the orm), then the orm mechanism needs to know (1) if it should be involved in the cascade to children objects and in that case, (2) how it should handle it (set null, delete, restrict). passive_delete addresses (1) and is particularly useful in situations where you already have foreign key rules set at the db level and wish to let the db handle cascades.Clemens
Because I ended up at this answer three times now while getting the deletes done: For SQLite it is vital to enable foreign_keys. In short: cursor.execute('PRAGMA foreign_keys=ON') -- using event.listens_for as explained in docs.sqlalchemy.org/en/13/dialects/… (edit: and explained in another answer stackoverflow.com/a/62327279)Barleycorn
Wish I could upvote this answer more than once - it took me waaaay too long to figure out that x = session.query(T).all(); [session.delete(y) for y in x] is not the same as session.query(T).delete(). It's particularly non-obvious when using flask_sqlalchemy, which gives you T.query.delete() which really looks like it should take into account the relationships for some reason.Russ
And now in 2.0 they are deprecating session.query and showing session.execute as the alternative.Bestir
if you are using session.query, this is the ANSWER. saved my bacon. thank you!!!Beret
You should add cascade="all, delete" to handle the case where some children are loaded. @Ayn pointed to the problem already, but the docs say you should add cascade="all, delete" not use passive_deletes='all'.Wahl
I
9

Alex Okrushko answer almost worked best for me. Used ondelete='CASCADE' and passive_deletes=True combined. But I had to do something extra to make it work for sqlite.

Base = declarative_base()
ROOM_TABLE = "roomdata"
FURNITURE_TABLE = "furnituredata"

class DBFurniture(Base):
    __tablename__ = FURNITURE_TABLE
    id = Column(Integer, primary_key=True)
    room_id = Column(Integer, ForeignKey('roomdata.id', ondelete='CASCADE'))


class DBRoom(Base):
    __tablename__ = ROOM_TABLE
    id = Column(Integer, primary_key=True)
    furniture = relationship("DBFurniture", backref="room", passive_deletes=True)

Make sure to add this code to ensure it works for sqlite.

from sqlalchemy import event
from sqlalchemy.engine import Engine
from sqlite3 import Connection as SQLite3Connection

@event.listens_for(Engine, "connect")
def _set_sqlite_pragma(dbapi_connection, connection_record):
    if isinstance(dbapi_connection, SQLite3Connection):
        cursor = dbapi_connection.cursor()
        cursor.execute("PRAGMA foreign_keys=ON;")
        cursor.close()

Stolen from here: SQLAlchemy expression language and SQLite's on delete cascade

Inevasible answered 11/6, 2020 at 14:57 Comment(0)
M
7

Steven is correct in that you need to explicitly create the backref, this results in the cascade being applied on the parent (as opposed to it being applied to the child like in the test scenario).

However, defining the relationship on the Child does NOT make sqlalchemy consider Child the parent. It doesn't matter where the relationship is defined (child or parent), its the foreign key that links the two tables that determines which is the parent and which is the child.

It makes sense to stick to one convention though, and based on Steven's response, I'm defining all my child relationships on the parent.

Missymist answered 21/4, 2013 at 13:57 Comment(0)
F
7

Steven's answer is solid. I'd like to point out an additional implication.

By using relationship, you're making the app layer (Flask) responsible for referential integrity. That means other processes that access the database not through Flask, like a database utility or a person connecting to the database directly, will not experience those constraints and could change your data in a way that breaks the logical data model you worked so hard to design.

Whenever possible, use the ForeignKey approach described by d512 and Alex. The DB engine is very good at truly enforcing constraints (in an unavoidable way), so this is by far the best strategy for maintaining data integrity. The only time you need to rely on an app to handle data integrity is when the database can't handle them, e.g. versions of SQLite that don't support foreign keys.

If you need to create further linkage among entities to enable app behaviors like navigating parent-child object relationships, use backref in conjunction with ForeignKey.

Fu answered 27/9, 2017 at 22:22 Comment(0)
O
6

I struggled with the documentation as well, but found that the docstrings themselves tend to be easier than the manual. For example, if you import relationship from sqlalchemy.orm and do help(relationship), it will give you all the options you can specify for cascade. The bullet for delete-orphan says:

if an item of the child's type with no parent is detected, mark it for deletion.
Note that this option prevents a pending item of the child's class from being persisted without a parent present.

I realize your issue was more with the way the documentation for defining parent-child relationships. But it seemed that you might also be having a problem with the cascade options, because "all" includes "delete". "delete-orphan" is the only option that's not included in "all".

Outturn answered 3/6, 2011 at 11:16 Comment(1)
Using help(..) on the sqlalchemy objects helps a lot! Thanks :-))) ! PyCharm shows nothing in context docks, and plainly forgot to check the help. Thank you a lot!Verbalize
I
3

Even tho this question is very old, it comes up first when searched for in Google so I'll post my solution to add up to what others said (I've spent few hours even after reading all the answers in here).

As d512 explained, it is all about Foreign Keys. It was quite a surprise to me but not all databases / engines support Foreign Keys. I'm running a MySQL database. After long investigation, I noticed that when I create new table it defaults to an engine (MyISAM) that doesn't support Foreign Keys. All I had to do was to set it to InnoDB by adding mysql_engine='InnoDB' when defining a Table. In my project I'm using an imperative mapping and it looks like so:

db.Table('child',
    Column('id', Integer, primary_key=True),
    # other columns
    Column('parent_id',
           ForeignKey('parent.id', ondelete="CASCADE")),
    mysql_engine='InnoDB')
Introduction answered 26/6, 2021 at 16:35 Comment(0)
D
2

Answer by Stevan is perfect. But if you are still getting the error. Other possible try on top of that would be -

http://vincentaudebert.github.io/python/sql/2015/10/09/cascade-delete-sqlalchemy/

Copied from the link-

Quick tip if you get in trouble with a foreign key dependency even if you have specified a cascade delete in your models.

Using SQLAlchemy, to specify a cascade delete you should have cascade='all, delete' on your parent table. Ok but then when you execute something like:

session.query(models.yourmodule.YourParentTable).filter(conditions).delete()

It actually triggers an error about a foreign key used in your children tables.

The solution I used it to query the object and then delete it:

session = models.DBSession()
your_db_object = session.query(models.yourmodule.YourParentTable).filter(conditions).first()
if your_db_object is not None:
    session.delete(your_db_object)

This should delete your parent record AND all the children associated with it.

Dictaphone answered 20/6, 2019 at 22:41 Comment(2)
Is calling .first() required? What filter conditions return a list of objects, and everything has to be deleted? Isn't calling .first() gets only the first object? @PrashantDreiser
I've come across exactly this problem. I would like to clear my whole table in pytest fixture, but session.query(Parent).delete() doesn't seem to be working with foreign keys. Querying object and then iterating through each on separately doesn't seem to be the correct solution, at least it shouldn't be....Patty
E
1

TLDR: If the above solutions don't work, try adding nullable=False to your column.

I'd like to add a small point here for some people who may not get the cascade function to work with the existing solutions (which are great). The main difference between my work and the example was that I used automap. I do not know exactly how that might interfere with the setup of cascades, but I want to note that I used it. I am also working with a SQLite database.

I tried every solution described here, but rows in my child table continued to have their foreign key set to null when the parent row was deleted. I'd tried all the solutions here to no avail. However, the cascade worked once I set the child column with the foreign key to nullable = False.

On the child table, I added:

Column('parent_id', Integer(), ForeignKey('parent.id', ondelete="CASCADE"), nullable=False)
Child.parent = relationship("parent", backref=backref("children", passive_deletes=True)

With this setup, the cascade functioned as expected.

Eudocia answered 17/10, 2019 at 19:5 Comment(2)
I also use automap and only setting the nullable=False that I can delete the child rows.Preglacial
Don’t write "the above solutions" because the order in which answers appear can vary. All the answers above yours are more recent, so I believe you refer to the ones below.Selffertilization

© 2022 - 2024 — McMap. All rights reserved.