replace NULL values with latest non-NULL value in resultset series (SQL Server 2008 R2)
Asked Answered
V

3

11

for SQL Server 2008 R2

I have a resultset that looks like this (note [price] is numeric, NULL below represents a NULL value, the result set is ordered by product_id and timestamp)

product timestamp          price 
------- ----------------   -----
   5678 2008-01-01 12:00   12.34
   5678 2008-01-01 12:01    NULL
   5678 2008-01-01 12:02    NULL
   5678 2008-01-01 12:03   23.45
   5678 2008-01-01 12:04    NULL

I want to transform that to a result set that (essentially) copies a non-null value from the latest preceding row, to produce a resultset that looks like this:

product timestamp          price  
------- ----------------   -----
   5678 2008-01-01 12:00   12.34
   5678 2008-01-01 12:01   12.34
   5678 2008-01-01 12:02   12.34
   5678 2008-01-01 12:03   23.45
   5678 2008-01-01 12:04   23.45

I don't find any aggregate/windowing function that will allow me to do this (again this ONLY needed for SQL Server 2008 R2.)

I was hoping to find an analytic aggregate function that do this for me, something like...

LAST_VALUE(price) OVER (PARTITION BY product_id ORDER BY timestamp)

But I don't seem to find any way to do a "cumulative latest non-null value" in the window (to bound the window to the preceding rows, rather than the entire partition)

Aside from creating a table-valued user defined function, is there any builtin that would accomplish this?


UPDATE:

Apparently, this functionality is available in the 'Denali' CTP, but not in SQL Server 2008 R2.

LAST_VALUE http://msdn.microsoft.com/en-us/library/hh231517%28v=SQL.110%29.aspx

I just expected it to be available in SQL Server 2008. It's available in Oracle (since 10gR2 at least), and I can do something similar in MySQL 5.1, using a local variable.

http://download.oracle.com/docs/cd/E14072_01/server.112/e10592/functions083.htm

Viewy answered 12/8, 2011 at 18:59 Comment(1)
Which function do you mean that's available in Denali, but not 2008 R2??Panpipe
F
13

You can try the following:

* Updated **

-- Test Data
DECLARE @YourTable TABLE(Product INT, Timestamp DATETIME, Price NUMERIC(16,4))

INSERT INTO @YourTable
SELECT 5678, '20080101 12:00:00', 12.34
UNION ALL
SELECT 5678, '20080101 12:01:00', NULL
UNION ALL
SELECT 5678, '20080101 12:02:00', NULL
UNION ALL
SELECT 5678, '20080101 12:03:00', 23.45
UNION ALL
SELECT 5678, '20080101 12:04:00', NULL

;WITH CTE AS
(
    SELECT *
    FROM @YourTable
)

-- Query
SELECT A.Product, A.Timestamp, ISNULL(A.Price,B.Price) Price
FROM CTE A
OUTER APPLY (   SELECT TOP 1 *
                FROM CTE 
                WHERE Product = A.Product AND Timestamp < A.Timestamp
                AND Price IS NOT NULL
                ORDER BY Product, Timestamp DESC) B

--Results
Product Timestamp   Price
5678    2008-01-01 12:00:00.000 12.3400
5678    2008-01-01 12:01:00.000 12.3400
5678    2008-01-01 12:02:00.000 12.3400
5678    2008-01-01 12:03:00.000 23.4500
5678    2008-01-01 12:04:00.000 23.4500
Fit answered 12/8, 2011 at 19:12 Comment(12)
That APPLY keyword is new to me. I will give this a shot. Thanks!Viewy
@Viewy - Also, you can find some documentation on the APPLY operator on this link: technet.microsoft.com/en-us/library/ms175156.aspxFit
The row source is more complicated than @YourTable... i'd rather only specify the row source once. Could this be rewritten using a common table expression WITH cte AS (SELECT * FROM @YourTable) SELECT ... Viewy
@Viewy - Ok, I change the query so it uses a CTEFit
The results look promising, it looks like it's working for me!Viewy
Hi, i have the same problem and was wondering if i use left join instead of outer apply in mysql will i get the same result?Pandit
@sleepingdragon90731 you can't just change it for a left join, it would be more like a correlated subquery.Fit
can you help how to do that so that i can get the same result as what you did when using outer apply?Pandit
@sleepingdragon90731 not without knowing the details of your problem, you might consider asking a question with all the information. Nevertheless, it should be something like SELECT col1, col2, (SELECT TOP 1 coln FROM AnotherTable WHERE col = A.col ORDER BY something) FROM YourTable AFit
I almost have the same problem as mentioned above but in mysql. But i will try your suggestion. thank youPandit
actually here is the question that i created yesterday #28057128Pandit
How to improve performance? I had about 650,000 rows to process and it took me more than 3 minutes.Uneven
U
8

I have a table containing the following data. I want to update all nulls in salary columns with previous value without taking null value.

Table:

id  name    salary
1   A       4000
2   B   
3   C   
4   C   
5   D       2000
6   E   
7   E   
8   F       1000
9   G       2000
10  G       3000
11  G       5000
12  G   

here is the query that works for me.

select a.*,first_value(a.salary)over(partition by a.value order by a.id) as abc from
(
     select *,sum(case when salary is null then 0 else 1 end)over(order by id) as value from test)a

output:

id  name    salary  Value   abc
1   A       4000    1     4000
2   B               1     4000
3   C               1     4000
4   C               1     4000
5   D       2000    2     2000
6   E               2     2000
7   E               2     2000
8   F       1000    3     1000
9   G       2000    4     2000
10  G       3000    5     3000
11  G       5000    6     5000
12  G               6     5000
Upthrust answered 9/6, 2017 at 7:16 Comment(0)
P
4

Try this:

;WITH SortedData AS
(
    SELECT
       ProductID, TimeStamp, Price,
       ROW_NUMBER() OVER(PARTITION BY ProductID ORDER BY TimeStamp DESC) AS 'RowNum'
    FROM dbo.YourTable
)
UPDATE SortedData
SET Price = (SELECT TOP 1 Price 
             FROM SortedData sd2
         WHERE sd2.RowNum > SortedData.RowNum 
           AND sd2.Price IS NOT NULL)
WHERE
    SortedData.Price IS NULL

Basically, the CTE creates a list sorted by timestamp (descending) - the newest first. Whenever a NULL is found, the next row that contains a NOT NULL price will be found and that value is used to update the row with the NULL price.

Panpipe answered 12/8, 2011 at 19:19 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.