When I use Spark ML to predict labels, the resulting DataFrame is:
scala> result.show
+-----------+--------------+
|probability|predictedLabel|
+-----------+--------------+
| [0.0,1.0]| 0.0|
| [0.0,1.0]| 0.0|
| [0.0,1.0]| 0.0|
| [0.0,1.0]| 0.0|
| [0.0,1.0]| 0.0|
| [0.1,0.9]| 0.0|
| [0.0,1.0]| 0.0|
| [0.0,1.0]| 0.0|
| [0.0,1.0]| 0.0|
| [0.0,1.0]| 0.0|
| [0.0,1.0]| 0.0|
| [0.0,1.0]| 0.0|
| [0.1,0.9]| 0.0|
| [0.6,0.4]| 1.0|
| [0.6,0.4]| 1.0|
| [1.0,0.0]| 1.0|
| [0.9,0.1]| 1.0|
| [0.9,0.1]| 1.0|
| [1.0,0.0]| 1.0|
| [1.0,0.0]| 1.0|
+-----------+--------------+
only showing top 20 rows
I want to create a new DataFrame with an additional column named prob that holds the first value of the Vector in the probability column of the original DataFrame, e.g.:
+-----------+--------------+----------+
|probability|predictedLabel| prob |
+-----------+--------------+----------+
| [0.0,1.0]| 0.0| 0.0|
| [0.0,1.0]| 0.0| 0.0|
| [0.0,1.0]| 0.0| 0.0|
| [0.0,1.0]| 0.0| 0.0|
| [0.0,1.0]| 0.0| 0.0|
| [0.1,0.9]| 0.0| 0.1|
| [0.0,1.0]| 0.0| 0.0|
| [0.0,1.0]| 0.0| 0.0|
| [0.0,1.0]| 0.0| 0.0|
| [0.0,1.0]| 0.0| 0.0|
| [0.0,1.0]| 0.0| 0.0|
| [0.0,1.0]| 0.0| 0.0|
| [0.1,0.9]| 0.0| 0.1|
| [0.6,0.4]| 1.0| 0.6|
| [0.6,0.4]| 1.0| 0.6|
| [1.0,0.0]| 1.0| 1.0|
| [0.9,0.1]| 1.0| 0.9|
| [0.9,0.1]| 1.0| 0.9|
| [1.0,0.0]| 1.0| 1.0|
| [1.0,0.0]| 1.0| 1.0|
+-----------+--------------+----------+
How can I extract this value into a new column?
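
For reference, one possible approach is a small UDF that returns the element at index 0 of the vector. This is only a sketch under some assumptions: it assumes Spark 2.x or later, where the probability column holds an org.apache.spark.ml.linalg.Vector (on Spark 1.x the class would be org.apache.spark.mllib.linalg.Vector instead). The object name FirstProbability, the helper withFirstProbability, and the three sample rows are made up for illustration and stand in for the real result DataFrame.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, udf}

object FirstProbability {
  // UDF that returns the element at index 0 of an ML vector.
  // Assumption: the column type is org.apache.spark.ml.linalg.Vector (Spark 2.x+).
  val firstElement = udf((v: Vector) => v(0))

  // Hypothetical helper: adds a "prob" column holding the first
  // entry of the "probability" vector column.
  def withFirstProbability(df: DataFrame): DataFrame =
    df.withColumn("prob", firstElement(col("probability")))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("first-probability")
      .getOrCreate()

    // Made-up sample rows standing in for the real `result` DataFrame.
    val result = spark.createDataFrame(Seq(
      (Vectors.dense(0.0, 1.0), 0.0),
      (Vectors.dense(0.1, 0.9), 0.0),
      (Vectors.dense(0.6, 0.4), 1.0)
    )).toDF("probability", "predictedLabel")

    withFirstProbability(result).show()

    spark.stop()
  }
}
```

In the spark-shell, the same idea would reduce to defining firstElement and calling result.withColumn("prob", firstElement(col("probability"))). A UDF is used here because a Vector column is a user-defined type rather than an array column, so plain array-style indexing on the column is not expected to apply.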