I'm using PySpark to write to a Kafka broker. The broker uses a JAAS (SASL) security mechanism, so I need to pass a username and password as environment variables:
import os

# data_frame is an existing Spark DataFrame with an `id` column.
data_frame \
    .selectExpr('CAST(id AS STRING) AS key', "to_json(struct(*)) AS value") \
    .write \
    .format('kafka') \
    .option('topic', topic) \
    .option('kafka.ssl.endpoint.identification.algorithm', 'https') \
    .option('kafka.bootstrap.servers', os.environ['BOOTSTRAP_SERVER']) \
    # JAAS login module configured with credentials taken from the environment.
    .option('kafka.sasl.jaas.config',
            'org.apache.kafka.common.security.plain.PlainLoginModule required '
            f'username="{os.environ["USERNAME"]}" password="{os.environ["PASSWORD"]}";') \
    .option('kafka.sasl.mechanism', 'PLAIN') \
    .option('kafka.security.protocol', 'SASL_SSL') \
    .mode('append') \
    .save()
Locally I used Python's os.environ[""] to retrieve the environment variables. How do I pass them to an AWS Glue job?
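For context, here is a minimal sketch of the local pattern described above, assuming the variables BOOTSTRAP_SERVER, USERNAME and PASSWORD are exported in the shell that launches the script (the variable names are taken from the snippet; everything else is illustrative only):

import os

# Read connection settings from the local environment; raises KeyError if a variable is unset.
bootstrap_server = os.environ['BOOTSTRAP_SERVER']
kafka_username = os.environ['USERNAME']
kafka_password = os.environ['PASSWORD']

# Build the string expected by the kafka.sasl.jaas.config option.
jaas_config = (
    'org.apache.kafka.common.security.plain.PlainLoginModule required '
    f'username="{kafka_username}" password="{kafka_password}";'
)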
So with os.environ the code would stay the same at the different levels, and only the environment variables set on the server would need to change? – Mateya
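For illustration of what the comment is asking about (this is an assumption, not something stated in the question): in Glue, values like these are often supplied as job parameters rather than OS-level environment variables, and read with awsglue.utils.getResolvedOptions. A minimal sketch, assuming hypothetical job parameters named USERNAME and PASSWORD are configured on the job:

import sys
from awsglue.utils import getResolvedOptions  # available inside the Glue job runtime

# Hypothetical job parameters --USERNAME and --PASSWORD set on the Glue job;
# getResolvedOptions parses them from sys.argv and fails if they are missing.
args = getResolvedOptions(sys.argv, ['USERNAME', 'PASSWORD'])

jaas_config = (
    'org.apache.kafka.common.security.plain.PlainLoginModule required '
    f'username="{args["USERNAME"]}" password="{args["PASSWORD"]}";'
)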