All the examples in the Databricks documentation are in Scala. I can't find how to use this trigger type from PySpark. Is there an equivalent API or a workaround?
Trigger.AvailableNow for Delta source streaming queries in PySpark (Databricks)
The Python implementation missed the Spark 3.2 release, so for the OSS version it will only be included in Spark 3.3. On Databricks it was released as part of DBR 10.3 (or 10.2?), and can be used as follows:
.trigger(availableNow=True)
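If the same code also has to run on an older PySpark that does not know the `availableNow` keyword yet, one possible workaround is to fall back to `once=True`. This is only a sketch: `with_drain_trigger` is a hypothetical helper name, and it assumes the older `trigger()` rejects the unknown keyword with a `TypeError`. Note that `once=True` processes everything in a single batch, so the behaviour is not identical.

```python
from typing import Any


def with_drain_trigger(writer: Any) -> Any:
    """Hypothetical helper: prefer availableNow, fall back to once=True.

    `writer` is expected to be a pyspark.sql.streaming.DataStreamWriter,
    e.g. the result of sdf.writeStream.
    """
    try:
        # Available in OSS Spark 3.3+ and in recent Databricks runtimes.
        return writer.trigger(availableNow=True)
    except TypeError:
        # Assumption: older PySpark raises TypeError for the unknown keyword.
        # once=True handles all available data in one batch, then stops.
        return writer.trigger(once=True)
```

Usage would look like `writer = with_drain_trigger(sdf.writeStream)`.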
Here is the official documentation:
DataStreamWriter.trigger(*, processingTime: Optional[str] = None,
once: Optional[bool] = None,
continuous: Optional[str] = None,
availableNow: Optional[bool] = None) -> pyspark.sql.streaming.DataStreamWriter
availableNow: bool, optional
if set to True, set a trigger that processes all available data in multiple batches then terminates the query. Only one trigger can be set.
# trigger the query for reading all available data with multiple batches
writer = sdf.writeStream.trigger(availableNow=True)
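For completeness, here is a minimal sketch of a full query built around that writer. The function name `run_available_now` and the path arguments are placeholders for illustration, not part of the PySpark API, and writing to a Delta sink is an assumption based on the question.

```python
from typing import Any


def run_available_now(sdf: Any, target_path: str, checkpoint_path: str) -> Any:
    """Process everything currently available in a streaming DataFrame, then stop.

    `sdf` is expected to be a streaming pyspark.sql.DataFrame, for example
    spark.readStream.format("delta").load(source_path).
    """
    query = (
        sdf.writeStream
        .format("delta")  # assumed sink format
        .option("checkpointLocation", checkpoint_path)  # remembers progress between runs
        .trigger(availableNow=True)  # needs a PySpark version that supports it
        .start(target_path)
    )
    # Blocks until the query stops, which happens after the backlog is processed.
    query.awaitTermination()
    return query
```

Because progress is stored in the checkpoint, running this again later should only pick up data that arrived since the previous run.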