MySQL: Select DISTINCT / UNIQUE, but return all columns?

Asked 25/5, 2011 at 15:55 Answered 22/11, 2022 at 6:17

489

SELECT DISTINCT field1, field2, field3, ......
FROM table;

I am trying to accomplish the following SQL statement, but I want it to return all columns.
Is this possible?

Something like this:

SELECT DISTINCT field1, * 
FROM table;

Commonplace answered 25/5, 2011 at 15:55 Comment(7)

Why doesn't SELECT DISTINCT * FROM table does not work for you? – Infantryman 25/5, 2011 at 15:57

If your table has a PK all rows should be distinct by definition. If you are trying to just select DISTINCT field1 but somehow return all other columns what should happen for those columns that have more than one value for a particular field1 value? You would need to use GROUP BY and some sort of aggregation on the other columns for example. – Alonaalone 25/5, 2011 at 15:57

If you want repeated rows and not only distinct rows, remove the distinct key word. – Roa 25/5, 2011 at 15:57

Could you give an example of what you expect the results to look like? So far, I can't make any sense of your desired query. – Placable 25/5, 2011 at 16:4

I only want the field 1 to be distinct, DISTINCT * won't work – Commonplace 25/5, 2011 at 23:20

Here is the answer of similar question asked, you need to first get the distinct column with their ids and then join it with the original table. SELECT DISTINCT on one column, return multiple other columns – Berserker 21/4, 2013 at 9:40

Which dbms are you using? So many contradictory/confusing answers below. (Answers to questions tagged with SQL should use ISO/IEC standard SQL.) – Capsule 8/3, 2018 at 13:42

491

You're looking for a group by:

select *
from table
group by field1

Which can occasionally be written with a distinct on statement:

select distinct on field1 *
from table

On most platforms, however, neither of the above will work because the behavior on the other columns is unspecified. (The first works in MySQL, if that's what you're using.)

You could fetch the distinct fields and stick to picking a single arbitrary row each time.

On some platforms (e.g. PostgreSQL, Oracle, T-SQL) this can be done directly using window functions:

select *
from (
   select *,
          row_number() over (partition by field1 order by field2) as row_number
   from table
   ) as rows
where row_number = 1

On others (MySQL, SQLite), you'll need to write subqueries that will make you join the entire table with itself (example), so not recommended.

Springs answered 25/5, 2011 at 16:5 Comment(11)

I think you forgot an alias row_number() over (partition by field1) row_number – Barajas 25/5, 2011 at 16:17

The query won't parse for me and gives an error: The ranking function "row_number" must have an ORDER BY clause. We need to add order by clause after partition by field1. So the correct query will be

select * from (    select *,           row_number() over (partition by field1 order by orderbyFieldName) as row_number    from table    ) as rows where row_number = 1

– Kempe 27/11, 2012 at 6:23

Thanks! I was in the same problem and the solution was the GROUP BY – Adsorbate 13/11, 2015 at 16:46

Also in Oracle (Oracle SQL Developer) you can not specify select *, row_number() over (partition by field1 order by field2) as row_number from table. You have to explicitly use table name/alias in select query select **table**.*, row_number() over (partition by field1 order by field2) as row_number from table – Zebrass 8/2, 2017 at 12:22

"Answers to questions tagged with SQL should use ISO/IEC standard SQL." – Capsule 8/3, 2018 at 13:43

@jarlh: Might be ... today. As you may notice, this answer is almost 7 years old, a point in time where that wasn't the case insofar as I can recollect from back when I was active. You're welcome to retag and/or edit the answer if you feel it's necessary. – Springs 8/3, 2018 at 16:13

select distinct on field1 * from table; This doesn't seem like correct syntax for mysql - at least no anymore. – Shelving 18/5, 2018 at 10:49

select distinct on (field1) * from table; works also in PostgreSQL – Ghazi 14/6, 2019 at 14:2

Cannot group on fields selected with '*'. – Hedda 16/3, 2021 at 18:20

@ChilianuBogdan thank you, you are a life saver! This should be the answer right here. Very compact. – Berberidaceous 25/3, 2021 at 0:6

doesnt work in mysql5.5, gives: Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column – Sartorial 2/11, 2021 at 14:4

From the phrasing of your question, I understand that you want to select the distinct values for a given field and for each such value to have all the other column values in the same row listed. Most DBMSs will not allow this with neither DISTINCT nor GROUP BY, because the result is not determined.

Think of it like this: if your field1 occurs more than once, what value of field2 will be listed (given that you have the same value for field1 in two rows but two distinct values of field2 in those two rows).

You can however use aggregate functions (explicitely for every field that you want to be shown) and using a GROUP BY instead of DISTINCT:

SELECT field1, MAX(field2), COUNT(field3), SUM(field4), ....
FROM table GROUP BY field1

Literality answered 25/5, 2011 at 16:13 Comment(4)

+1 for this solution. So we can do SELECT field1, MIN(field2), MIN(field3), MIN(field4), .... FROM table GROUP BY field1, and field2, 3, 4,,, are not required to be integers (or other digits), they can be char fields as well – Penstemon 21/7, 2015 at 13:51

Was working nicely until I got stuck at a boolean column. MIN(Dynamic) column values get modified to false even if it was true.. Any other aggregate function available to address boolean – signonsridhar 6 mins ago. Sum(dynamic) changed false to 1 – Argeliaargent 22/8, 2016 at 2:0

Great suggestion, led me to my solution which I think is more universal -- take a look! – Reciprocal 16/12, 2016 at 22:4

@Argeliaargent cast your boolean to an int and use sum; e.g. sum(cast(COL as int)) > 0 – Calve 8/5, 2018 at 15:17

If I understood your problem correctly, it's similar to one I just had. You want to be able limit the usability of DISTINCT to a specified field, rather than applying it to all the data.

If you use GROUP BY without an aggregate function, which ever field you GROUP BY will be your DISTINCT filed.

If you make your query:

SELECT * from table GROUP BY field1;

It will show all your results based on a single instance of field1.

For example, if you have a table with name, address and city. A single person has multiple addresses recorded, but you just want a single address for the person, you can query as follows:

SELECT * FROM persons GROUP BY name;

The result will be that only one instance of that name will appear with its address, and the other one will be omitted from the resulting table. Caution: if your fileds have atomic values such as firstName, lastName you want to group by both.

SELECT * FROM persons GROUP BY lastName, firstName;

because if two people have the same last name and you only group by lastName, one of those persons will be omitted from the results. You need to keep those things into consideration. Hope this helps.

Work answered 14/5, 2013 at 17:43 Comment(1)

As mentioned in the accepted answer, would work for most incarnations of SQL -- only for MYSQL – Reciprocal 16/12, 2016 at 22:3

That's a really good question. I have read some useful answers here already, but probably I can add a more precise explanation.

Reducing the number of query results with a GROUP BY statement is easy as long as you don't query additional information. Let's assume you got the following table 'locations'.

--country-- --city--
 France      Lyon
 Poland      Krakow
 France      Paris
 France      Marseille
 Italy       Milano

Now the query

SELECT country FROM locations
GROUP BY country

will result in:

--country--
 France
 Poland
 Italy

However, the following query

SELECT country, city FROM locations
GROUP BY country

...throws an error in MS SQL, because how could your computer know which of the three French cities "Lyon", "Paris" or "Marseille" you want to read in the field to the right of "France"?

In order to correct the second query, you must add this information. One way to do this is to use the functions MAX() or MIN(), selecting the biggest or smallest value among all candidates. MAX() and MIN() are not only applicable to numeric values, but also compare the alphabetical order of string values.

SELECT country, MAX(city) FROM locations
GROUP BY country

will result in:

--country-- --city--
 France      Paris
 Poland      Krakow
 Italy       Milano

or:

SELECT country, MIN(city) FROM locations
GROUP BY country

will result in:

--country-- --city--
 France      Lyon
 Poland      Krakow
 Italy       Milano

These functions are a good solution as long as you are fine with selecting your value from the either ends of the alphabetical (or numeric) order. But what if this is not the case? Let us assume that you need a value with a certain characteristic, e.g. starting with the letter 'M'. Now things get complicated.

The only solution I could find so far is to put your whole query into a subquery, and to construct the additional column outside of it by hands:

SELECT
     countrylist.*,
     (SELECT TOP 1 city
     FROM locations
     WHERE
          country = countrylist.country
          AND city like 'M%'
     )
FROM
(SELECT country FROM locations
GROUP BY country) countrylist

will result in:

--country-- --city--
 France      Marseille
 Poland      NULL
 Italy       Milano

Aluino answered 24/10, 2017 at 16:49 Comment(0)

SELECT  c2.field1 ,
        field2
FROM    (SELECT DISTINCT
                field1
         FROM   dbo.TABLE AS C
        ) AS c1
        JOIN dbo.TABLE AS c2 ON c1.field1 = c2.field1

Flibbertigibbet answered 25/5, 2011 at 15:58 Comment(3)

Why there is C alias when it can work without it? in line FROM dbo.TABLE AS C – Thalamus 2/10, 2017 at 10:53

I believe this is due to my use of RedGate SQLPrompt. The way I have it configured, it always adds aliases - even if unnecessary. It's there "just in case" – Flibbertigibbet 13/3, 2018 at 3:50

This looked promising for me but it still brought back all the rows, not the distinct field1. :( – Makeshift 7/6, 2020 at 11:19

Great question @aryaxt -- you can tell it was a great question because you asked it 5 years ago and I stumbled upon it today trying to find the answer!

I just tried to edit the accepted answer to include this, but in case my edit does not make it in:

If your table was not that large, and assuming your primary key was an auto-incrementing integer you could do something like this:

SELECT 
  table.*
FROM table
--be able to take out dupes later
LEFT JOIN (
  SELECT field, MAX(id) as id
  FROM table
  GROUP BY field
) as noDupes on noDupes.id = table.id
WHERE
  //this will result in only the last instance being seen
  noDupes.id is not NULL

Reciprocal answered 16/12, 2016 at 22:14 Comment(0)

Try

SELECT table.* FROM table 
WHERE otherField = 'otherValue'
GROUP BY table.fieldWantedToBeDistinct
limit x

Stringfellow answered 31/1, 2019 at 10:57 Comment(0)

You can do it with a WITH clause.

For example:

WITH c AS (SELECT DISTINCT a, b, c FROM tableName)
SELECT * FROM tableName r, c WHERE c.rowid=r.rowid AND c.a=r.a AND c.b=r.b AND c.c=r.c

This also allows you to select only the rows selected in the WITH clauses query.

Candide answered 29/3, 2013 at 18:31 Comment(0)

For SQL Server you can use the dense_rank and additional windowing functions to get all rows AND columns with duplicated values on specified columns. Here is an example...

with t as (
    select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r1' union all
    select col1 = 'c', col2 = 'b', col3 = 'a', other = 'r2' union all
    select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r3' union all
    select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r4' union all
    select col1 = 'c', col2 = 'b', col3 = 'a', other = 'r5' union all
    select col1 = 'a', col2 = 'a', col3 = 'a', other = 'r6'
), tdr as (
    select 
        *, 
        total_dr_rows = count(*) over(partition by dr)
    from (
        select 
            *, 
            dr = dense_rank() over(order by col1, col2, col3),
            dr_rn = row_number() over(partition by col1, col2, col3 order by other)
        from 
            t
    ) x
)

select * from tdr where total_dr_rows > 1

This is taking a row count for each distinct combination of col1, col2, and col3.

Atomy answered 2/4, 2015 at 16:26 Comment(1)

too complicated and specific to one implementation of SQL – Reciprocal 16/12, 2016 at 22:1

select min(table.id), table.column1
from table 
group by table.column1

Infirm answered 19/10, 2019 at 18:28 Comment(2)

This worked for me!! It's worth noting tho, if you are using fetch_array() then you will need to call each row via an index label rather than implicitly calling the row name. There aren't enough characters in this for me to write out the example I have :X sorry!! – Transparent 3/6, 2020 at 17:32

i think this only works in mysql, not postgresql – Berberidaceous 25/3, 2021 at 0:4

SELECT *
FROM tblname
GROUP BY duplicate_values
ORDER BY ex.VISITED_ON DESC
LIMIT 0 , 30

in ORDER BY i have just put example here, you can also add ID field in this

Necking answered 16/10, 2013 at 12:42 Comment(1)

As mentioned in the accepted answer, would work for most incarnations of SQL -- only for MYSQL – Reciprocal 16/12, 2016 at 22:1

Found this elsewhere here but this is a simple solution that works:

 WITH cte AS /* Declaring a new table named 'cte' to be a clone of your table */
 (SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY val1 DESC) AS rn
 FROM MyTable /* Selecting only unique values based on the "id" field */
 )
 SELECT * /* Here you can specify several columns to retrieve */
 FROM cte
 WHERE rn = 1

Makeshift answered 7/6, 2020 at 11:32 Comment(1)

Works for MSSQL – Makeshift 7/6, 2020 at 11:32

In this way can get 2 unique column with 1 query only select Distinct col1,col2 from '{path}' group by col1,col2 you can increase your columns if need

Adaiha answered 22/11, 2022 at 6:17 Comment(0)

-1

Add GROUP BY to field you want to check for duplicates your query may look like

SELECT field1, field2, field3, ......   FROM table GROUP BY field1

field1 will be checked to exclude duplicate records

or you may query like

SELECT *  FROM table GROUP BY field1

duplicate records of field1 are excluded from SELECT

Dannie answered 5/10, 2014 at 23:12 Comment(1)

GROUP BY clause must match selected fields. else it will throw error like filed2 must appear in the GROUP BY clause or be used in an aggregate function – Bathrobe 5/2, 2016 at 13:14

-2

Just include all of your fields in the GROUP BY clause.

Pecuniary answered 24/6, 2013 at 21:32 Comment(1)

To make this a good answer, you should include a little more detail about what you mean. – Obloquy 24/6, 2013 at 21:53

-2

It can be done by inner query

$query = "SELECT * 
            FROM (SELECT field
                FROM table
                ORDER BY id DESC) as rows               
            GROUP BY field";

Ionization answered 25/11, 2016 at 15:36 Comment(1)

This does not answer the question, the OP was trying to get all the data of the table but remove rows containing duplicates of a single field – Reciprocal 16/12, 2016 at 21:59

-4

SELECT * from table where field in (SELECT distinct field from table)

Cons answered 20/7, 2012 at 14:0 Comment(1)

That won't do the job. You have selected the distinct column in the subquery but the where clause gets all those columns with that value. So the query is as good as writing 'select * from table' unless 'field' column is a unique column in which case the distinct on that column isn't required at all. – Kempe 27/11, 2012 at 6:8

-4

SELECT DISTINCT FIELD1, FIELD2, FIELD3 FROM TABLE1 works if the values of all three columns are unique in the table.

If, for example, you have multiple identical values for first name, but the last name and other information in the selected columns is different, the record will be included in the result set.

Inceptive answered 6/8, 2014 at 15:45 Comment(1)

This does not answer the question, the OP was trying to get all the data of the table but remove rows containing duplicates of a single field – Reciprocal 16/12, 2016 at 22:0

-5

I would suggest using

SELECT  * from table where field1 in 
(
  select distinct field1 from table
)

this way if you have the same value in field1 across multiple rows, all the records will be returned.

Ratline answered 25/1, 2018 at 19:10 Comment(2)

It is not different with SELECT * FROM table;. Even more It is slow. – Giacopo 26/8, 2018 at 17:1

Please, try your answer first. – Maplemaples 25/9, 2018 at 5:57

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags