Spark dataframe add a row for every existing row - McMap

Asked 10/7, 2017 at 3:19 Answered 10/7, 2017 at 3:33

Solved scala apache-spark apache-spark-sql explode

I have a dataframe with the following columns:

groupid,unit,height
----------------------
1,in,55
2,in,54

I want to create another dataframe with additional rows where unit=cm and height=height*2.54.

Resulting dataframe:

groupid,unit,height
----------------------
1,in,55
2,in,54
1,cm,139.7
2,cm,137.16

I am not sure how to use a Spark UDF and explode here. Any help is appreciated. Thanks in advance.

Enabling asked 10/7, 2017 at 3:19

You can create a second dataframe with the required changes using withColumn, and then union the two dataframes:

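// Needed for toDF; with a SparkSession named spark, use `import spark.implicits._` instead
import sqlContext.implicits._
import org.apache.spark.sql.functions._

// Sample data from the question
val df = Seq(
  (1, "in", 55),
  (2, "in", 54)
).toDF("groupid", "unit", "height")

// Same rows with the unit replaced by "cm" and the height converted from inches
val df2 = df.withColumn("unit", lit("cm")).withColumn("height", col("height")*2.54)

// Append the converted rows to the original rows
df.union(df2).show(false)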

You should get:

+-------+----+------+
|groupid|unit|height|
+-------+----+------+
|1      |in  |55.0  |
|2      |in  |54.0  |
|1      |cm  |139.7 |
|2      |cm  |137.16|
+-------+----+------+

Alodee answered 10/7, 2017 at 3:33
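Since the question asks about explode specifically, here is a minimal sketch of that route. It is not part of the answer above, only an alternative under some assumptions: it runs in spark-shell or against a local SparkSession, and the names `measurements`, `exploded`, and the alias `m` are made up for illustration. No UDF is needed; each input row gets an array of two structs (the original measurement and the converted one), and explode turns that array into two rows.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{array, col, explode, lit, struct}

// Assumption: a local session; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("explode-units-sketch")
  .getOrCreate()
import spark.implicits._

val df = Seq(
  (1, "in", 55),
  (2, "in", 54)
).toDF("groupid", "unit", "height")

// One struct per desired output row. Both structs need the same field
// names and types, so the original height is cast to double.
val measurements = array(
  struct(col("unit").as("unit"), col("height").cast("double").as("height")),
  struct(lit("cm").as("unit"), (col("height") * 2.54).as("height"))
)

// explode emits one output row per array element.
val exploded = df
  .select(col("groupid"), explode(measurements).as("m"))
  .select(col("groupid"), col("m.unit").as("unit"), col("m.height").as("height"))

exploded.show(false)
```

This produces the same four rows as the union, but the two rows for each groupid typically come out next to each other rather than with all the "in" rows first. If the order matters, add an explicit orderBy. The union version is probably easier to read when only one extra row per input row is needed; the explode version avoids scanning the source dataframe twice.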
