Doctrine2: Limiting with Left Joins / Pagination - Best Practice
Asked Answered
R

2

7

i have a big query (in my query builder) and a lot of left joins. So i get Articles with their comments and tags and so on. Let's say i have the following dql:

$dql = 'SELECT blogpost, comment, tags 
FROM BlogPost blogpost 
LEFT JOIN blogpost.comments comments
LEFT JOIN blogpost.tags tags';

Now let's say my database has more than 100 blogposts but i only want the first 10, but with all the comments of those 10 and all their tags, if they exist. If i use setMaxResults it limits the Rows. So i might get the first two Posts, but the last one of those is missing some of it's comments or tags. So the followin doesn't work.

$result = $em->createQuery($dql)->setMaxResults(15)->getResult();

Using the barely documented Pagination Solution that ships with doctrine2.2 doesn't really work for me either since it is so slow, i could as well load all the data.

I tried the Solutions in the Stackoverflow Article, but even that Article is still missing a Best Practise and the presented Solution is deadly slow.

Isn't there a best practise on how to do this? Is nobody using Doctrine2.2 in Production mode?

Raby answered 22/5, 2012 at 12:10 Comment(1)
Please add the code you have to your question, maybe even with example results to show what you want, and what you are getting.Corium
P
11

Getting the proper results with a query like this is problematic. There is a tutorial on the Doctrine website explaining this problem.

Pagination

The tutorial is more about pagination rather than getting the top 5 results, but the overall idea is that you need to do a "SELECT DISTINCT a.id FROM articles a ... LIMIT 5" instead of a normal SELECT. It's a little more complicated than this, but the last 2 points in that tutorial should put you on the right track.

Update:

The problem here is not Doctrine, or any other ORM. The problem lies squarely on the database being able to return the results you're asking for. This is just how joins work.

If you do an EXPLAIN on the query, it will give you a more in depth answer of what is happening. It would be a good idea to add the results of that to your initial question.

Building on what is discussed in the Pagination article, it would appear that you need at least 2 queries to get your desired results. Adding DISTINCT to a query has the potential to dramatically slow down your query, but its only really needed if you have joins in it. You could write another query that just retrieves the first 10 posts ordered by created date, without the joins. Once you have the IDs of those 10 posts, do another query with your joins, and a WHERE blogpost.id IN (...) ORDER BY blogpost.created. This method should be much more efficient.

SELECT 
    bp 
FROM 
    Blogpost bp 
ORDER BY 
    bp.created DESC
LIMIT 10

Since all you care about in the first query are the IDs, you could set Doctrine to use Scalar Hydration.

SELECT 
    bg 
FROM 
    Blogpost bp 
LEFT JOIN 
    bp.comments c 
LEFT JOIN 
    bp.tags t 
WHERE 
    bp.id IN (...) 
ORDER BY 
    bp.created DESC

You could also probably do it in one query using a correlated subquery. The myth that subqueries are always bad is NOT true. Sometimes they are faster than joins. You will need to experiment to find out what the best solution is for you.

Piatt answered 8/6, 2012 at 0:42 Comment(0)
C
2

Edit in light of the clarified question:

You can do what you want in native MySQL using a subquery in the FROM clause as such:

SELECT * FROM 
 (SELECT * FROM articles ORDER BY date LIMIT 5) AS limited_articles, 
 comments, 
 tags 
WHERE
 limited_articles.article_id=comments.article_id
 limited_articles.article_id=tags.article_id

As far as I know, DQL does not support subqueries like this, so you can use the NativeQuery class.

Carnap answered 7/6, 2012 at 23:42 Comment(3)
No, i don't want the most recent M of my N. I want the most recent N with all their M,X and Y.Raby
I am not a big fan of Using a DABL/ORM Framework and then still using native Queries. Doesn't make much sense to me.Raby
Then it looks like you will need to make two queries. I agree that there should be a better way to limit and page results of joins.Carnap

© 2022 - 2024 — McMap. All rights reserved.