I have created a sample dataset, employee.txt, packaged inside a .zip archive, and used the pandas library to read the compressed text file. There may be other approaches, but this one is simple and works well.
Records in employee.txt:
Name;dept;age
Ravi kumar;Data Science;29
Amitesh Kumar;QA;29
Rohit Kumar;Sales;29
Ahimanyu;java;29
# import required modules
import zipfile
import pandas as pd
# read the dataset straight from the zip archive (compression='zip', semicolon-separated)
pdf = pd.read_csv(r'C:\Users\ravi\Documents\pyspark test\dataset\employee.zip', compression='zip', sep=';')
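# The zipfile import is only needed if the archive holds more than one file and
# you want to pick a specific member. An equivalent read of employee.txt (the
# member name given above) through zipfile looks like this; pdf_alt is just an
# illustrative variable name and produces the same DataFrame as pdf:
with zipfile.ZipFile(r'C:\Users\ravi\Documents\pyspark test\dataset\employee.zip') as z:
    pdf_alt = pd.read_csv(z.open('employee.txt'), sep=';')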
# create the Spark session and convert the pandas DataFrame to a Spark DataFrame
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("zip reader").getOrCreate()
sparkDF = spark.createDataFrame(pdf)
sparkDF.show()  # show() prints the DataFrame itself, so wrapping it in print() would just add "None"
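# optional sanity check: confirm the column types Spark inferred from pandas
# before pushing the data to MySQL
sparkDF.printSchema()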
# MySQL connection details
driver = "com.mysql.jdbc.Driver"  # with MySQL Connector/J 8+ the class is "com.mysql.cj.jdbc.Driver"
url = "jdbc:mysql://127.0.0.1:3306/test"
user = "root"
pwd = "India@123"
# write the final output to the RDBMS (MySQL) over JDBC
sparkDF.write.format("jdbc").option("driver", driver)\
.option("url", url)\
.option("dbtable", "employee")\
.option("user", user)\
.option("password", pwd)\
.save()
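One thing to note: the JDBC write only works if the MySQL Connector/J jar is on Spark's classpath. A minimal way to supply it when building the session (the jar path and version below are placeholders for whatever you have installed) is:

spark = SparkSession.builder \
    .appName("zip reader") \
    .config("spark.jars", r"C:\path\to\mysql-connector-j-8.0.33.jar") \
    .getOrCreate()  # placeholder path/version for the Connector/J jar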
Final Output:
+-------------+------------+---+
| Name| dept|age|
+-------------+------------+---+
| Ravi kumar|Data Science| 29|
|Amitesh Kumar| QA| 29|
| Rohit Kumar| Sales| 29|
| Ahimanyu| java| 29|
+-------------+------------+---+
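To double-check the load from the Spark side, the same JDBC options can be reused to read the table back. This sketch assumes the write above completed and the variables are still in scope; checkDF is just an illustrative name:

checkDF = spark.read.format("jdbc") \
    .option("driver", driver) \
    .option("url", url) \
    .option("dbtable", "employee") \
    .option("user", user) \
    .option("password", pwd) \
    .load()
checkDF.show()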