Matching nested model association attribute with includes
Asked Answered
F

4

6

Suppose I have the following models:

class Post < ActiveRecord::Base
  has_many :authors

class Author < ActiveRecord::Base
  belongs_to :post

And suppose the Author model has an attribute, name.

I want to search for all posts with a given author "alice", by that author's name. Say there is another author "bob" who co-authored a post with alice.

If I search for the first result using includes and where:

post = Post.includes(:authors).where("authors.name" => "alice").first

You'll see that the post only has one author now, even if in fact there are more:

post.authors #=> [#<Author id: 1, name: "alice", ...>]
post.reload
post.authors #=> [#<Author id: 1, name: "alice", ...>, #<Author id: 2, name: "bob", ...>]

The problem seems to be the combination of includes and where, which limits the scope correctly to the desired post, but at the same time hides all associations except for the one that is matched.

I want to end up with an ActiveRecord::Relation for chaining, so the reload solution above is not really satisfactory. Replacing includes by joins solves this, but does not eager load the associations:

Post.joins(:authors).where("authors.name" => "alice").first.authors
#=> [#<Author id: 1, name: "alice", ...>, #<Author id: 2, name: "bob", ...>]
Post.joins(:authors).where("authors.name" => "alice").first.authors.loaded?
#=> false

Any suggestions? Thanks in advance, I've been banging my head over this problem for a while.

Farinaceous answered 9/2, 2012 at 3:33 Comment(1)
note: I realize the post/author association should more realistically be HABTM, but it doesn't change anything for the purpose of this issue.Farinaceous
F
1

Coming back to this question after a long long time, I realized there is a better way to do this. The key is to do not one but two joins, one with includes and one with Arel using a table alias:

posts   = Post.arel_table
authors = Author.arel_table.alias("matching_authors")
join    = posts.join(authors, Arel::Nodes::InnerJoin).
                on(authors[:post_id].eq(posts[:id])).join_sources

post = Post.includes(:authors).joins(join).
            where(matching_authors: { name: "Alice" }).first

The SQL for this query is quite long since it has includes, but the key point is that it has not one but two joins, one (from includes) using a LEFT OUTER JOIN on the alias posts_authors, the other (from the Arel join) using an INNER JOIN on the alias matching_authors. The WHERE only applies to the latter alias, so results on the association in the returned results are not limited by this condition.

Farinaceous answered 27/5, 2018 at 0:8 Comment(0)
R
1

I see what you're doing as expected behaviour, at least that's how SQL works... You're restricting the join on authors to where authors.id = 1, so why would it load any others? ActiveRecord just takes the rows that the database returned, it has no way of knowing if there are others, without doing another query based on the posts.id.

Here's one possible solution with a subquery, this will work as a chainable relation, and executes in one query:

relation = Post.find_by_id(id: Author.where(id:1).select(:post_id))

If you add the includes, you will see the queries happen one of two ways:

relation = relation.includes(:authors)

relation.first
# 1. Post Load SELECT DISTINCT `posts`.`id`...
# 2. SQL SELECT `posts`.`id` AS t0_r0, `posts`.`title` AS t0_r1, ...

relation.all.first
# 1. SQL SELECT `posts`.`id` AS t0_r0, `posts`.`title` AS t0_r1, ...

So depending on the scenario, ActiveRecord decides whether to look up the id with a simpler query before loading all the associated authors. Sometimes it makes more sense to run the query in 2 steps.

Ralphralston answered 22/7, 2012 at 11:9 Comment(1)
Thanks very much, this makes a lot of sense. I still find the behaviour of rails in this case unintuitive, and not what I think the average user would expect, but perhaps that's just inevitable.Farinaceous
F
1

Coming back to this question after a long long time, I realized there is a better way to do this. The key is to do not one but two joins, one with includes and one with Arel using a table alias:

posts   = Post.arel_table
authors = Author.arel_table.alias("matching_authors")
join    = posts.join(authors, Arel::Nodes::InnerJoin).
                on(authors[:post_id].eq(posts[:id])).join_sources

post = Post.includes(:authors).joins(join).
            where(matching_authors: { name: "Alice" }).first

The SQL for this query is quite long since it has includes, but the key point is that it has not one but two joins, one (from includes) using a LEFT OUTER JOIN on the alias posts_authors, the other (from the Arel join) using an INNER JOIN on the alias matching_authors. The WHERE only applies to the latter alias, so results on the association in the returned results are not limited by this condition.

Farinaceous answered 27/5, 2018 at 0:8 Comment(0)
T
1

I ran into the same issue (which I describe as: where clause filters the associated model, rather than the primary model, when includes is used to prevent N+1 queries).

After flailing around trying various solutions, I found that using preload in conjunction with joins solves this for me. The Rails documentation is not super useful here. But apparently preload will explicitly use two separate queries, one to filter/select the primary model, and a second query to load the associated models. This blog post also has some insights that helped lead me to the solution.

Applying this to your models would be something like:

post = Post.preload(:authors).joins(:authors).where("authors.name" => "alice").first

I suspect that under the covers this is doing the same thing as your accepted answer, but at a nicer level of abstraction.

I wish the Rails docs were more explicit about how to do this. It's subtle enough that I wrote a bunch of tests around this precise situation in my code base.

There answered 16/12, 2018 at 14:28 Comment(1)
Thanks for your comment! That is indeed another way to do it. However, just to clarify, any solution which involves more than one query is different from the accepted solution. The accepted solution uses two joins on the same table (one aliased), but only one query. It is somewhat complicated and may not necessarily be more performant than a two-query solution, but it does what I originally wanted to do. You're right that it would be nice if Rails could be more explicit about how to solve this kind of problem, since it must be fairly common.Farinaceous
G
-2

Actually, it's because this code:

post = Post.includes(:authors).where("authors.name" => "alice").first

returns the first matched record because of the ".first". I think if you did this:

post = Post.includes(:authors).where("authors.name" => "alice")

you would get back all posts with "alice" and her other co-authors if I understand what you're asking correctly.

Gravois answered 16/7, 2012 at 23:27 Comment(1)
No that's not the problem. The query returns the correct posts (in this case the ones where alice is one of the authors). But when I access the results of that query, the authors association on the post returned (in this case the first one, but it doesn't matter) has only the author I searched for, even if there may be others.Farinaceous

© 2022 - 2024 — McMap. All rights reserved.