how to get the distinct records based on maximum date?

B

8

12

I'm working with Sql server 2008.i have a table contains following columns,

    Id,
    Name,
    Date

this table contains more than one record for same id.i want to get distinct id having maximum date.how can i write sql query for this?

Bonesetter answered 28/4, 2014 at 8:56 Comment(1)

First show us what you have already tried please. – Jarita 28/4, 2014 at 8:58

C

22

Use the ROW_NUMBER() function and PARTITION BY clause. Something like this:

SELECT Id, Name, Date FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Date desc) AS ROWNUM 
    FROM [MyTable]
) x WHERE ROWNUM = 1

Coffeehouse answered 28/4, 2014 at 8:59 Comment(2)

This helped me as well, thanks. What does the x represent before the WHERE statement? – Puckery 14/1, 2021 at 18:23

it's an alias for the subquery. See #4630479 – Coffeehouse 14/1, 2021 at 19:30

H

4

If you need only ID column and other columns are NOT required, then you don't need to go with ROW_NUMBER or MAX or anything else. You just do a Group By over ID column, because whatever the maximum date is you will get same ID.

SELECT ID FROM table GROUP BY ID
--OR
SELECT DISTINCT ID FROM table

If you need ID and Date columns with maximum date, then simply do a Group By on ID column and select the Max Date.

SELECT ID, Max(Date) AS Date
FROM table 
GROUP BY ID

If you need all the columns but 1 line having Max. date then you can go with ROW_NUMBER or MAX as mentioned in other answers.

SELECT *
FROM   table AS M
WHERE  Exists(
        SELECT 1
        FROM   table
        WHERE  ID = M.ID
        HAVING M.Date = Max(Date)
        )

Heisel answered 28/4, 2014 at 9:44 Comment(0)

P

2

One way, using ROW_NUMBER:

With CTE As
(
    SELECT Id, Name, Date, Rn = Row_Number() Over (Partition By Id
                                                   Order By Date DESC)
    FROM dbo.TableName
)
SELECT Id --, Name, Date 
FROM CTE
WHERE Rn = 1

If multiple max-dates are possible and you want all you could use DENSE_RANK instead.

Here's an overview of sql-server's ranking function: http://technet.microsoft.com/en-us/library/ms189798.aspx

By the way, CTE is a common-table-expression which is similar to a named sub-query. I'm using it to be able to filter by the row_number. This approach allows to select all columns if you want.

Philipphilipa answered 28/4, 2014 at 8:59 Comment(0)

M

1

select Max(Date) as "Max Date" from table group by Id order by Id

Menzies answered 19/6, 2019 at 15:54 Comment(0)

T

0

Try with Max(Date) and GROUP BY the other two columns (the ones with repeating data)..

SELECT ID, Max(Date) as date, Name
FROM YourTable
GROUP BY ID, Name

Tripedal answered 28/4, 2014 at 8:59 Comment(0)

S

0

You may try with this

DECLARE  @T TABLE(ID INT, NAME VARCHAR(50),DATE DATETIME) 
 INSERT INTO @T VALUES(1,'A','2014-04-20'),(1,'A','2014-04-28')
,(2,'A2','2014-04-22'),(2,'A2','2014-04-24')
,(3,'A3','2014-04-20'),(3,'A3','2014-04-28')
,(4,'A4','2014-04-28'),(4,'A4','2014-04-28')
,(5,'A5','2014-04-28'),(5,'A5','2014-04-28')

SELECT T.ID FROM @T T 
WHERE T.DATE=(SELECT MAX(A.DATE) 
              FROM @T A 
              WHERE A.ID=T.ID 
              GROUP BY A.ID ) 
GROUP BY T.ID

Sarasvati answered 28/4, 2014 at 9:15 Comment(0)

E

0

select id, max(date) from NameOfYourTable group by id;

Epidemiology answered 26/1, 2017 at 16:38 Comment(0)

G

0

The ROW_NUMBER() function and PARTITION BY clause worked for me thanks!! I needed the End_timestamp for each distinct assetName along with the rest of the records in the row. This example uses a sql statement in Python.

*Python (DataBricks Notebook)

*sql_statement = ''' SELECT assetName, EngModel, total_cnt, flagged_cnt,diff_EE_avg,flagged_pct,priority,Start_timestamp,End_timestamp FROM ( SELECT , ROW_NUMBER() OVER (PARTITION BY assetName ORDER BY End_timestamp desc) AS ROWNUM FROM ml_dev.ml_prediction_historical ) x WHERE ROWNUM = 1 and End_timestamp BETWEEN current_timestamp() - INTERVAL '36' hours AND current_timestamp() ''' df_telemetry = spark.sql(sql_statement) display(df_telemetry)

Output

Gudren answered 19/1, 2024 at 22:14 Comment(0)

Recommended topics

Hot tags