This is my dataframe. I'm trying to drop the duplicate columns that share the same name, using their index:
df = spark.createDataFrame([(1,2,3,4,5)],['c','b','a','a','b'])
df.show()
Output:
+---+---+---+---+---+
| c| b| a| a| b|
+---+---+---+---+---+
| 1| 2| 3| 4| 5|
+---+---+---+---+---+
I built a mapping from each column's index to its name:
col_dict = {x: col for x, col in enumerate(df.columns)}
col_dict
Output:
{0: 'c', 1: 'b', 2: 'a', 3: 'a', 4: 'b'}
Now I need to drop the duplicate columns that share the same name.
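One direction that might work is sketched below. It assumes that the first occurrence of each name is the one to keep, which is not stated above. The helper names `first_occurrence_indexes` and `drop_duplicate_columns` and the temporary `_c<index>` column names are made up for illustration. The idea is to rename the columns by position with `df.toDF(...)` so that every name is unique, select the positions to keep, and then restore the original names.

```python
def first_occurrence_indexes(columns):
    """Return the positions of the first occurrence of each column name."""
    seen = set()
    keep = []
    for idx, name in enumerate(columns):
        if name not in seen:
            seen.add(name)
            keep.append(idx)
    return keep


def drop_duplicate_columns(df):
    """Hypothetical helper: keep only the first column for each repeated name.

    Assumes `df` is a PySpark DataFrame (uses df.columns, df.toDF, df.select).
    """
    columns = df.columns
    keep = first_occurrence_indexes(columns)
    # Rename positionally so every column gets a unique, index-based name;
    # selecting by the original names would be ambiguous for 'a' and 'b'.
    temp_names = [f"_c{idx}" for idx in range(len(columns))]
    renamed = df.toDF(*temp_names)
    # Select the kept positions, then restore the original names.
    kept_temp = [temp_names[i] for i in keep]
    kept_original = [columns[i] for i in keep]
    return renamed.select(*kept_temp).toDF(*kept_original)
```

With the dataframe above, `first_occurrence_indexes(df.columns)` gives `[0, 1, 2]`, so `drop_duplicate_columns(df)` would keep columns `c`, `b`, `a` with the values 1, 2, 3. Keeping the last occurrence instead would only require changing which indexes the first helper collects.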