
Spark streaming AvailableNow trigger terminates after first batch

seb-emmot opened this issue · 1 comments

I am trying to build a Spark Structured Streaming application to ingest data from Azure Event Hubs and persist it to a Delta table in Databricks.
I'm using the AvailableNow trigger in Spark Structured Streaming.
This trigger should process all data from the source in batches according to

Bug Report:

  • Actual behavior
    The stream starts and processes the first batch, then terminates.
  • Expected behavior
    The stream starts, processes all available data in micro-batches, then terminates.
  • Spark version
  • spark-eventhubs artifactId and version

It seems like the support for the 'AvailableNow' trigger might not be implemented?

My code:

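import org.apache.spark.eventhubs.{ConnectionStringBuilder, EventHubsConf}

// namespace_str holds the Event Hubs connection string.
// EventHubsConf expects a String, so the builder has to be finalized with .build.
val connectionString = ConnectionStringBuilder(namespace_str).build

val ehConf = EventHubsConf(connectionString)

// Read from Event Hubs as a streaming DataFrame.
val inStream = spark.readStream.format("eventhubs").options(ehConf.toMap).load()

val outStream = inStream.writeStream
  .option("checkpointLocation", checkpointLocation)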
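
The snippet above stops before the trigger is set and the query is started. For anyone trying to reproduce this, here is a minimal sketch of how the write side might be completed. The helper `writeAvailableNow`, its `outputPath` and `sinkFormat` parameters, and the wrapping object are hypothetical names introduced for illustration, not part of the original report. It assumes Spark 3.3.0 or later (where `Trigger.AvailableNow()` exists) and a Delta sink written by path.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.streaming.{StreamingQuery, Trigger}

object AvailableNowSketch {
  // Hypothetical helper, not from the original report: starts a streaming
  // write that is expected to drain the data available at start-up in
  // micro-batches and then stop on its own.
  // `outputPath` and `sinkFormat` are illustrative assumptions.
  def writeAvailableNow(
      inStream: DataFrame,
      checkpointLocation: String,
      outputPath: String,
      sinkFormat: String = "delta"): StreamingQuery =
    inStream.writeStream
      .format(sinkFormat)
      // Trigger.AvailableNow() is available from Spark 3.3.0 onwards.
      .trigger(Trigger.AvailableNow())
      .option("checkpointLocation", checkpointLocation)
      .start(outputPath)
}
```

With the `inStream` and `checkpointLocation` from my code, calling `AvailableNowSketch.writeAvailableNow(inStream, checkpointLocation, outputPath)` and then `awaitTermination()` on the returned query should block until the query stops. Counting the entries in the query's `recentProgress` afterwards is one way to check how many micro-batches actually ran.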

I have previously asked a related question on Stack Overflow (in PySpark, though).

Hi, I am facing the same issue. Is there any fix for this @yamin-msft @hmlam? If so, when will this feature be available?