Assume I have this dataframe
import polars as pl
df = pl.DataFrame({
'item': ['CASH', 'CHECK', 'DEBT', 'CHECK', 'CREDIT', 'CASH'],
'quantity': [100, -20, 0, 10, 0, 0],
'value': [99, 47, None, 90, None, 120],
'value_other': [97, 57, None, 91, None, 110],
'value_other2': [94, 37, None, 93, None, 115],
})
┌────────┬──────────┬───────┬─────────────┬──────────────┐
│ item ┆ quantity ┆ value ┆ value_other ┆ value_other2 │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 ┆ i64 ┆ i64 │
╞════════╪══════════╪═══════╪═════════════╪══════════════╡
│ CASH ┆ 100 ┆ 99 ┆ 97 ┆ 94 │
│ CHECK ┆ -20 ┆ 47 ┆ 57 ┆ 37 │
│ DEBT ┆ 0 ┆ null ┆ null ┆ null │
│ CHECK ┆ 10 ┆ 90 ┆ 91 ┆ 93 │
│ CREDIT ┆ 0 ┆ null ┆ null ┆ null │
│ CASH ┆ 0 ┆ 120 ┆ 110 ┆ 115 │
└────────┴──────────┴───────┴─────────────┴──────────────┘
Now I want to set all value columns to 0
for all rows where value is null
and quantity == 0
.
Right now I have this solution
cols = ['value', 'value_other', 'value_other2']
df = df.with_columns([
pl.when(pl.col('value').is_null() & (pl.col('quantity') == 0))
.then(0)
.otherwise(pl.col(col))
.alias(col)
for col in cols
])
which correctly gives
┌────────┬──────────┬───────┬─────────────┬──────────────┐
│ item ┆ quantity ┆ value ┆ value_other ┆ value_other2 │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 ┆ i64 ┆ i64 │
╞════════╪══════════╪═══════╪═════════════╪══════════════╡
│ CASH ┆ 100 ┆ 99 ┆ 97 ┆ 94 │
│ CHECK ┆ -20 ┆ 47 ┆ 57 ┆ 37 │
│ DEBT ┆ 0 ┆ 0 ┆ 0 ┆ 0 │
│ CHECK ┆ 10 ┆ 90 ┆ 91 ┆ 93 │
│ CREDIT ┆ 0 ┆ 0 ┆ 0 ┆ 0 │
│ CASH ┆ 0 ┆ 120 ┆ 110 ┆ 115 │
└────────┴──────────┴───────┴─────────────┴──────────────┘
However, I feel this is very inefficient as my when
condition is executed for every value column. Is there a way to achieve this using only polar internal functions & without the native for-loop?