General Task
A table consists of three columns (time, key, value). The task is to calculate a running difference for each key. So, from input
---------------
| time | key | value |
---------------
| 1 | A | 4 |
| 2 | B | 1 |
| 3 | A | 6 |
| 4 | A | 7 |
| 5 | B | 3 |
| 6 | B | 7 |
it is desired to get
----------------------
| key | value | delta |
----------------------
| A | 4 | 0 |
| B | 1 | 0 |
| A | 6 | 2 |
| A | 7 | 1 |
| B | 3 | 2 |
| B | 7 | 4 |
Approaches
runningDifference
function. Works, if the key is fixed. So we canselect *, runningDifference(value) from (SELECT key, value from table where key = 'A' order by time)
Note that subquery is necessary here. This solution suffers when you want to get this for different keys
groupArray
.select key, groupArray(value) from (SELECT key, value from table order by time) group by key
So, now we get a key and an array of elements with this key. Good.
But how to calculate a sliding difference? If we could do that, then
ARRAY JOIN
would lead us to a result.Or we can even
zip
the array with itself and then apply lambda (we havearrayMap
for that) but... we don't have anyzip
alternative.
Any ideas? Thanks in advance.
time
column, meaning:| time | key | value | delta |
? – Hypochlorite