I have another question related to the split function; I am new to Spark/Scala.
Below is the sample DataFrame -
+-------------------+---------+
|             VALUES|Delimiter|
+-------------------+---------+
|       50000.0#0#0#|        #|
|          0@1000.0@|        @|
|                 1$|        $|
|1000.00^Test_string|        ^|
+-------------------+---------+
and I want the output to be -
+-------------------+---------+----------------------+
|VALUES             |Delimiter|split_values          |
+-------------------+---------+----------------------+
|50000.0#0#0#       |#        |[50000.0, 0, 0, ]     |
|0@1000.0@          |@        |[0, 1000.0, ]         |
|1$                 |$        |[1, ]                 |
|1000.00^Test_string|^        |[1000.00, Test_string]|
+-------------------+---------+----------------------+
I tried to split this manually with a hard-coded regex of all the delimiters -
dept.select(split(col("VALUES"), "#|@|\\$|\\^")).show()
and the output is -
+-----------------------+
|split(VALUES,#|@|\$|\^)|
+-----------------------+
| [50000.0, 0, 0, ]|
| [0, 1000.0, ]|
| [1, ]|
| [1000.00, Test_st...|
+-----------------------+
But I want to pick up the delimiter for each row from the Delimiter column automatically, since the real dataset is large and the delimiter varies from row to row.
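This is roughly what I have in mind, as a minimal sketch assuming a plain Scala UDF is acceptable (splitByDelim and result are just names from my own sketch, not tested code) -

import org.apache.spark.sql.functions.{col, udf}
import java.util.regex.Pattern

// Split VALUES on the per-row delimiter. Pattern.quote escapes regex
// metacharacters such as $ and ^, and limit = -1 keeps the trailing empty
// string, matching outputs like [50000.0, 0, 0, ] above.
val splitByDelim = udf { (value: String, delim: String) =>
  value.split(Pattern.quote(delim), -1)
}

val result = dept.withColumn("split_values", splitByDelim(col("VALUES"), col("Delimiter")))
result.show(false)

I have also seen answers that pass the pattern from another column through expr and concat, but I was not sure how to escape delimiters like $ and ^ that way.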