KDB; stripping characters from column of symbols
Asked Answered
H

5

7

Any ideas how to do this nicely in KDB?

Consider the table

X:([]a:1 2 3;b:`abc11`abc22`abc33;c:10 20 30)

I now want a new table Y that has the "abc" stripped out of the symbols in the second column, such that:

Y:([]a:1 2 3;b:`11`22`33;c:10 20 30)
Hyperpituitarism answered 21/4, 2014 at 17:40 Comment(0)
F
9
q)t:([]a:1 2 3;b:`abc11`abc22`abc33;c:10 20 30)
q)t
a b     c
----------
1 abc11 10
2 abc22 20
3 abc33 30

Strip the first three characters:

q)update `$3_'string b from t
a b    c
---------
1 11 10
2 22 20
3 33 30

Or search and replace:

q)update `$ssr[;"abc";""] each string b from t
a b    c
---------
1 11 10
2 22 20
3 33 30

If the table is large and has many repeating items, consider using .Q.fu:

q)t:1000000#([]a:1 2 3;b:`abc11`abc22`abc33;c:10 20 30)
q)\t r1:update `$3_'string b from t
111
q)\t r2:update .Q.fu[{`$3_'string x};b] from t
5
q)r1~r2
1b
Fearnought answered 21/4, 2014 at 18:32 Comment(2)
don't forget to cast back to `symbol.Oldfashioned
nice tip on using .Q.fuBibliographer
O
1

Using lambdas as suggested in the other answer is fine, but not necessary for this question. If something can be achieved using adverbs alone, that is usually preferable.

Y: update `$ 3_'string b from X

is shorter and more readable. On my machine it also runs twice as fast.

Y: update `$ ssr[;"abc";""] each string b from X

only runs slightly faster. I've cast both back to `symbol.

Oldfashioned answered 21/4, 2014 at 23:28 Comment(0)
L
1

To add on to @Ryan Hamilton's answer, here is an example if you want to strip the last two characters:

q)update `$-2_'string b from t
a b    c
---------
1 abc 10
2 abc 20
3 abc 30
Longeron answered 24/1, 2020 at 19:49 Comment(0)
C
0

I tried both mentioned solution (by ryan and mollmerx) on large database(100000 rows) and I got following results:

ssr took around 645ms (average)

cut version took around 40 ms (average)

I tried another solution which proved to be much faster. If we use dictionary behavior of a table:

t[`b]:  `$3_'string t`b

This took around 12 ms on an average.

Chibchan answered 22/4, 2014 at 5:29 Comment(0)
M
0

If your situation is more general and you want to remove everything before your first numeric then you could do something like below. You could further generalize to all columns with symbol type

q)show r1:update b:`${(first each where each x in .Q.n)_'x}string b from X
a b  c
-------
1 11 10
2 22 20
3 33 30
q)show r2:@[X; exec c from meta X where t="s"; `${(first each where each x in .Q.n)_'x}string@]
a b  c
-------
1 11 10
2 22 20
3 33 30
q)r1~r2
1b

Granted, I don't know what your set up is, but on my machine, this is quite quick. Not nearly as quick as your dictionary approach, but might be worth the sacrifice for the generality

q)\ts update b:`${(first each where each x in .Q.n)_'x}string b from X
69 12746848j
Mallissa answered 22/4, 2014 at 12:32 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.