MySQL JOIN with LIMIT 1 on joined table
Asked Answered
S

12

98

I want to join two tables, but only get 1 record of table2 per record on table1

For example:

SELECT c.id, c.title, p.id AS product_id, p.title
FROM categories AS c
JOIN products AS p ON c.id = p.category_id

This would get me all records in products, which is not what I want. I want 1 [the first] product per category (I have a sort column in the products field).

How do I go about doing that?

Spiderwort answered 29/7, 2011 at 21:49 Comment(0)
B
-43

I would try something like this:

SELECT C.*,
      (SELECT P.id, P.title 
       FROM products as P
       WHERE P.category_id = C.id
       LIMIT 1)
FROM categories C
Bushwhack answered 29/7, 2011 at 22:1 Comment(2)
this throws error, Error Code: 1241. Operand should contain 1 column(s)Opportune
Question is on join.Copra
D
127

I like more another approach described in a similar question: https://mcmap.net/q/218764/-mysql-join-with-limit-1

This approach is better especially in case if you need to show more than one field in SELECT. To avoid Error Code: 1241. Operand should contain 1 column(s) or double sub-select for each column.

For your situation the Query should looks like (this query also will work in PostgresQL and it is pretty fast, see my update below):

This is the fastest query. Use this one.

SELECT
 c.id,
 c.title,
 p.id AS product_id,
 p.title AS product_title
FROM categories AS c
JOIN products AS p ON
 p.id = (                                 --- the PRIMARY KEY
  SELECT p1.id FROM products AS p1
  WHERE c.id=p1.category_id
  ORDER BY p1.id LIMIT 1
 )

PS. I did the performance test of the query vs other proposed here, and this query is the best option yet!

UPDATE (2022-07-20, PostgresSQL)

I'm not working with mySQL for a while already, so, I decided to test the performance of my solution (which actually works perfect in both MySQL and PostgresQL) with solution provided by @Gravy in PostgresQL v.12.9.

For that I decided to create a dummy tables and data with 100 categories and 100000 products. You can check the code on this gist

I run my query above and it took only 13ms to run.

After I slightly modified (for postgres) the query from @Gravy:

This is slow, do not use it!

SELECT
  id,
  category_title,
  (array_agg(product_title))[1]  
FROM
    (SELECT c.id, c.title AS category_title, p.id AS product_id, p.title AS product_title
    FROM categories AS c
    JOIN products AS p ON c.id = p.category_id
    ORDER BY c.id ASC) AS a 
GROUP BY id, category_title;

and run it too. It took more than 150ms in my machine. Which is >10x times slower.

In defense of @gravy's solution, I agree with n+1 problem. But, in this particular case, usually the number of products is way larger than categories. So, running through each category is way less expensive than running through each product as in @Gravy's query.

By the way, if your table has 1mln products with 100 categories, the speed of my query is still the same (between 9-17ms), but the query from [@Gravy] takes more than 2 seconds to run

In resume, at this moment, my query is the most performant and optimal solution for the current task.

Feel free to comment.

Dermatitis answered 27/6, 2014 at 1:22 Comment(7)
@BenoitDuffez i bet its more difficult by exactly one degree (the subselect).Lowminded
@BenoitDuffez - See my solution to answer your question re performance impactFlagella
Unfortunately, too slow :(Specter
This is slower than @Gravy's answer.Unifoliolate
I just updated the answer with the performance test, and my query is >10x times faster than @Gravy's solution.Dermatitis
Been struggling for hours but this is the answer I needed! Thank you and thanks to ypercubeᵀᴹ for the answer you linked toRollandrollaway
Can you kindly "convert" this query for MariaDB current?Inchmeal
F
34

Accepted answer by @goggin13 looks wrong. Other solutions provided to-date will work, but suffer from the n+1 problem and as such, suffer a performance hit.

n+1 problem: If there are 100 categories, then we would have to do 1 select to get the categories, then for each of the 100 categories returned, we would need to do a select to get the products in that category. So 101 SELECT queries would be performed.

My alternative solution solves the n+1 problem and consequently should be significantly more performant as only 2 selects are being performed.

SELECT
  *
FROM
    (SELECT c.id, c.title, p.id AS product_id, p.title
    FROM categories AS c
    JOIN products AS p ON c.id = p.category_id
    ORDER BY c.id ASC) AS a 
GROUP BY id;
Flagella answered 4/3, 2015 at 11:37 Comment(5)
Interesting. Normally if you use ORDER BY and GROUP BY in a single query the GROUP BY will run first and ORDER BY will sort the groups. I take it that is the way to make ORDER BY go first.Burstone
@German - That's correct. And that way, you can select group_concat ensuring that your results are in the correct order.Flagella
Can anyone suggest a tweak to make this work with MySQL 5.7? It complains about non aggregated columns since ONLY_FULL_GROUP_BY is enabled by default. Thanks!Isocline
@ChrisBartley - Simply add the missing group by fields on any aggregated columns. You could also use ANY_VALUE() as per dev.mysql.com/doc/refman/5.7/en/group-by-handling.htmlFlagella
Hey, thank you for your solution. I took a time, and wrote a speed test for both, yours and mine queries. As I was expected, you query is >10x times slower. The main reason is because usually there are much more products in the table than categories, and your query runs through products table instead of mine, that runs through the categories. Please see my answer, I updated it.Dermatitis
S
17
SELECT c.id, c.title, p.id AS product_id, p.title
FROM categories AS c
JOIN products AS p ON c.id = p.category_id
GROUP BY c.id

This will return the first data in products (equals limit 1)

Scarborough answered 1/10, 2014 at 16:45 Comment(3)
Grouping seems to have a massive overhead. If I run the query without it (with a limit) it is much faster. I assume it's ignoring indexes?Overmatter
this will return one, but would that necessarily be the first?Louielouis
Easy to implement and if necessary you can group by multiple fields.Beason
L
7

What about this?

SELECT c.id, c.title, (SELECT id from products AS p 
                            WHERE c.id = p.category_id 
                            ORDER BY ... 
                            LIMIT 1)
   FROM categories AS c;
Lussier answered 29/7, 2011 at 22:0 Comment(0)
G
4

The With clause would do the trick. Something like this:

WITH SELECTION AS (SELECT id FROM products LIMIT 1)
SELECT a.id, c.id, c.title FROM selection a JOIN categories c ON (c.id = a.id);
Gomez answered 15/6, 2016 at 16:20 Comment(2)
As a sidenote, this will not work for MySQL 5.7 or lower... "WITH ... AS" syntax is available in MySQL 8, however.Gaul
No, it won't: db-fiddle.com/f/66RqT8qPrTfh9zmgCnqrrG/0 (schema taken from https://mcmap.net/q/218764/-mysql-join-with-limit-1/1073003)Hohenlohe
R
1
SELECT
 c.id,
 c.title,
 p.id AS product_id,
 p.title AS product_title
FROM categories AS c
JOIN products AS p ON
 p.id = (                               
  SELECT MIN(p1.id) FROM products AS p1
  WHERE c.id=p1.category_id
 )

Using MIN or MAX in a subquery will make your query run much faster.

Riannon answered 28/11, 2022 at 17:55 Comment(0)
A
0

Assuming you want product with MIN()imial value in sort column, it would look something like this.

SELECT 
  c.id, c.title, p.id AS product_id, p.title
FROM 
  categories AS c
INNER JOIN (
  SELECT
    p.id, p.category_id, p.title
  FROM
    products AS p
  CROSS JOIN (
    SELECT p.category_id, MIN(sort) AS sort
    FROM products
    GROUP BY category_id
  ) AS sq USING (category_id)
) AS p ON c.id = p.category_id
Addis answered 29/7, 2011 at 21:58 Comment(0)
N
0

Another example with 3 nested tables: 1/ User 2/ UserRoleCompanie 3/ Companie

  • 1 user has many UserRoleCompanie.
  • 1 UserRoleCompanie has 1 user and 1 Company
  • 1 Companie has many UserRoleCompanie
SELECT 
u.id as userId, 
u.firstName,
u.lastName,
u.email,
urc.id ,
urc.companieRole,
c.id as companieId,
c.name as companieName
FROM User as u 
JOIN UserRoleCompanie as urc ON u.id = urc.userId
    AND urc.id = (
        SELECT urc2.id
        FROM UserRoleCompanie urc2 
        JOIN Companie ON urc2.companieId = Companie.id
        AND urc2.userId = u.id 
        AND Companie.isPersonal = false
        order by Companie.createdAt DESC
        
        limit 1
    )
    
LEFT JOIN Companie as c ON urc.companieId = c.id
+---------------------------+-----------+--------------------+---------------------------+---------------------------+--------------+---------------------------+-------------------+
| userId                    | firstName | lastName           | email                     | id                        | companieRole | companieId                | companieName      |
+---------------------------+-----------+--------------------+---------------------------+---------------------------+--------------+---------------------------+-------------------+
| cjjt9s9iw037f0748raxmnnde | henry     | pierrot            | [email protected]           | cjtuflye81dwt0748e4hnkiv0 | OWNER        | cjtuflye71dws0748r7vtuqmg | leclerc           |
Neuropsychiatry answered 22/9, 2020 at 18:52 Comment(0)
G
0

In my opinion, this is the best answer (making it general):

SELECT  
TB1.Id  
FROM Table1 AS TB1  
INNER JOIN Table2 AS TB2 ON (TB1.Id = TB2.Id_TB1)  
    AND TB2.Id = (  
        SELECT Id  
        FROM Table2  
        WHERE TB1.Id = Id_TB1  
        ORDER BY Table2.Id DESC  
        LIMIT 1  
    )  
Grisons answered 1/3, 2021 at 13:47 Comment(1)
You're responding to a 10-year-old question. This answer will not limit the result, since the WHERE clause doesn't seem to do anything to make the JOIN selection unique.Camenae
I
-1

When using postgres you can use the DISTINCT ON syntex to limit the number of columns returned from either table.

Here is a sample of the code:

SELECT c.id, c.title, p.id AS product_id, p.title FROM categories AS c JOIN ( SELECT DISTINCT ON(p1.id) id, p1.title, p1.category_id FROM products p1 ) p ON (c.id = p.category_id)
The trick is not to join directly on the table with multiple occurrences of the id, rather, first create a table with only a single occurrence for each id

Icelandic answered 12/12, 2018 at 16:26 Comment(1)
Thanks for sharing. Unfortunately, MySQL 5.7 doesn't have many of the features that make this problem easy to solve in Postgres or SQL Server. MySQL 8 does, however, as it has the "WITH ... AS" syntax that can be applied as another answer suggests. If you find a way that would work for MySQL 5.7+, definitely let us know.Gaul
P
-3

Replace the tables with yours:

SELECT * FROM works w 
LEFT JOIN 
(SELECT photoPath, photoUrl, videoUrl FROM workmedias LIMIT 1) AS wm ON wm.idWork = w.idWork
Polymer answered 8/10, 2015 at 3:12 Comment(0)
B
-43

I would try something like this:

SELECT C.*,
      (SELECT P.id, P.title 
       FROM products as P
       WHERE P.category_id = C.id
       LIMIT 1)
FROM categories C
Bushwhack answered 29/7, 2011 at 22:1 Comment(2)
this throws error, Error Code: 1241. Operand should contain 1 column(s)Opportune
Question is on join.Copra

© 2022 - 2024 — McMap. All rights reserved.