Optimize a query using select/fby twice in q kdb
Asked Answered
M

2

6

I'm solving the classic problem of fby to find the max price per symbol from a trade table.
table: tr

time                          sym  src price size
-------------------------------------------------
2019.03.11D09:00:00.277000000 GOOG L   36.01 1427
2019.03.11D09:00:04.123000000 GOOG O   36.01 708
2019.03.11D09:00:08.123000000 MSFT N   35.5  7810
2019.03.11D09:00:10.123000000 MSFT O   31.1  1100

when I apply fby:

select from tr where price=(max;price) fby sym

Output is:

time                          sym  src price size
-------------------------------------------------
2019.03.11D09:00:00.277000000 GOOG L   36.01 1427
2019.03.11D09:00:04.123000000 GOOG O   36.01 708
2019.03.11D09:00:08.123000000 MSFT N   35.5  7810

But, as we can see I'm getting sym GOOG trade twice, since max price is same. Hence now I want to get the output per sym with last traded time of each symbol(along with max price). So, I use below query

select from (select from tr where price=(max;price) fby sym) where time=(last;time) fby sym

to get the ouput:

time                          sym  src price size
-------------------------------------------------
2019.03.11D09:00:04.123000000 GOOG O   36.01 708
2019.03.11D09:00:08.123000000 MSFT N   35.5  7810

Is there any better/optimized way to write above query which is not using select/fby twice?

Manrope answered 5/6, 2019 at 4:58 Comment(0)
P
6

You can use fby twice in your where clauses. Also consider the fact that where clauses are cascading so if you order them correctly, you'll get your desired results:

q)t:([]time:09:00 09:04 09:08 09:10;sym:`g`g`m`m;price:36.01 36.01 35.5 31.01)
q)select from t where price=(max;price) fby sym,time=(max;time) fby sym
time  sym price
---------------
09:04 g   36.01
09:08 m   35.5
Parrot answered 5/6, 2019 at 7:8 Comment(1)
Can we use fby in such a way that its mandatorily unequal ? I have a column which I wanted mandatorily different to be considered during aggregation.Mesquite
K
1

Slight optimisation in that second where clause will operate on the reduced set of the first constraint:

select from tr where price=(max;price) fby sym, time=(last;time) fby sym

Otherwise (and haven't timed it) "select by" has the same behaviour and is probably faster:

select by sym from tr where price=(max;price) fby sym
Kohlrabi answered 5/6, 2019 at 6:20 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.