Pyspark Function to Negate Column

Is there a built-in function to add a new column which is the negation of the original column?

Spark SQL has the function negative(), but PySpark does not seem to expose it.

df_new = df.withColumn(negative("original"))
Directorate answered 21/8, 2019 at 8:41 Comment(1)
I think you are asking: `df_new = df.withColumn("new_column_name", [what goes here?])` - Avery
9

Assuming your column original is boolean:

df_new = df.withColumn("not_original", ~df["original"])  # Equivalent to "not original"; the first argument names the new column
Lais answered 21/8, 2019 at 8:48 Comment(2)
Thanks Pierre, it looks like the '~' operator only works on Boolean types. This operator is handy though. - Directorate
Is there something PySpark-native to do this? I think if I use ~ it will be executed in a Python process, with memory overhead, rather than in the JVM. Is there an alternative that is executed within the JVM only? - Nicotiana
0

Based on @pierre-gourseaud's answer, spelled out with an explicit name for the new column (withColumn takes the column name as its first argument):

df_new = df.withColumn("new_column_name", ~df["original"])  # Equivalent to "not original"

Avery answered 24/5, 2023 at 23:12 Comment(0)
