I've successfully created a row_number() with partitionBy() in Spark using Window, but I would like to sort it in descending order instead of the default ascending. Here is my working code:
from pyspark import HiveContext
from pyspark.sql.types import *
from pyspark.sql import Row, functions as F
from pyspark.sql.window import Window
(
    data_cooccur
    .select(
        "driver",
        "also_item",
        "unit_count",
        F.rowNumber().over(
            Window
            .partitionBy("driver")
            .orderBy("unit_count")
        ).alias("rowNum")
    )
    .show()
)
That gives me this result:
+------+---------+----------+------+
|driver|also_item|unit_count|rowNum|
+------+---------+----------+------+
|   s10|      s11|         1|     1|
|   s10|      s13|         1|     2|
|   s10|      s17|         1|     3|
+------+---------+----------+------+
And here I add desc() to order descending:
(
    data_cooccur
    .select(
        "driver",
        "also_item",
        "unit_count",
        F.rowNumber().over(
            Window
            .partitionBy("driver")
            .orderBy("unit_count")
            .desc()
        ).alias("rowNum")
    )
    .show()
)
And get this error:
> AttributeError: 'WindowSpec' object has no attribute 'desc'
What am I doing wrong here?
row_number instead of rowNumber. – Philipphilipa