filter by row_number in sqlalchemy
Asked Answered
B

2

16

how can I filter the row_number ==1 in the following query:

query = session.query(Foo, func.row_number().over(
        partition_by=Foo.foo_field, order_by=desc(Foo.foo_date_time)).label("row_number"))

    query = query.filter(Foo.time_key <= time_key)
    # query = query.filter(row_number == 1)
    result = query.all()
Blastoderm answered 2/7, 2016 at 12:35 Comment(0)
B
33

I found it:

    row_number_column = func.row_number().over(partition_by=Foo.foo_field, order_by=desc(Foo.foo_date_time)).label('row_number')
    query = self.session.query(Foo)
    query = query.filter(Foo.time_key <= time_key)
    query = query.add_column(row_number_column)
    query = query.from_self().filter(row_number_column == 1)
Blastoderm answered 2/7, 2016 at 12:58 Comment(0)
R
10

It can also be done using a subquery:

sub_query = session.query(
Foo,
func.row_number().over(partition_by=Foo.foo_field, order_by=desc(Foo.foo_date_time)).label("row_number")
)
sub_query = sub_query.filter(Foo.time_key <= time_key).subquery()
query = session.query(sub_query).filter(sub_query.c.row_number == 1)

It basically generates an equivalent SQL Statement as adding the row_number as a column.

Compiling the statement.

sub_query = session.query(
Foo,
func.row_number().over(partition_by=Foo.foo_field, order_by=desc(Foo.foo_date_time)).label("row_number")
)
sub_query = sub_query.filter(Foo.time_key <= time_key).subquery()
query = session.query(sub_query).filter(sub_query.c.row_number == 1)

str(query.statement.compile())

Will generate the following:

-- SubQuery function
SELECT anon_1.time_key, 
       anon_1.foo_field, 
       anon_1.foo_date_time, 
       anon_1.row_number 
FROM (
        SELECT  foo.time_key AS time_key,
                foo.foo_field AS foo_field, 
                foo.foo_date_time AS foo_date_time, 
                row_number() OVER (PARTITION BY foo.foo_field ORDER BY foo.foo_date_time DESC) AS row_number 
        FROM foo 
        WHERE foo.time_key <= %(time_key_1)s
    ) AS anon_1 
WHERE anon_1.row_number = %(row_number_1)s

UPDATE: Keep in mind that Query.from_self method is deprecated since SQLAlchemy 1.4 and will be removed in 2.0 according to the official documentation

In case there's someone using it, you can check the migration guide for this case

Raid answered 14/5, 2019 at 0:9 Comment(2)
For some reason, using subquery() results in model class being lost ands a row is returned as a tuple. Anyway to make it retain its type? e.g. Foo after running query.all()You
@phoxd. Not sure if it will behave the same in current versions, I remember trying to adding the whole object to sub-query to keep reference of model class with no additional params, but it was a long time ago and not exaclty a common use case for me, you would have to test it out.Raid

© 2022 - 2025 — McMap. All rights reserved.