oban-bg/oban

Oban.Pro.Workers.Batch not executing callbacks when there is only one job in the batch

Closed this issue · 1 comments

Precheck

I don't see an issue quite like this on the GitHub repo. I attempted to join the Elixir lang Slack so I could message in the #oban channel, but was unable to -- it seems like the Request Invitation app on Heroku is broken for now. Anyway, writing this here in the hopes that you'll have some insights into the behaviour I'm observing.

Environment

  • Oban Version: 2.11
  • Oban Pro Version: 1.3
  • PostgreSQL Version: 14
  • Elixir & Erlang/OTP Versions: Elixir 1.15.4, Erlang/OTP 26 [erts-14.0.2]

Current Behavior

I've set up the following batch Worker as an example:

defmodule Distru.Workers.BrokenBatch do
  use Oban.Pro.Workers.Batch, queue: :default

  require Logger

  @impl true
  def process(%Job{meta: %{"batch_id" => batch_id}} = job) do
    Logger.error("INSIDE PROCESS #{batch_id}")

    Logger.error("FINISH PROCESS #{batch_id}")

    {:ok, :success}
  end

  @impl Batch
  def handle_attempted(%Job{}) do
    Logger.error("INSIDE HANDLE ATTEMPTED")
    :ok
  end
end

Then, I'm using the following script to kick off a batch via iex:

run = fn ->
  require Logger

  oban_batch_id = Ecto.UUID.generate()

  Enum.map([1], fn _ -> %{} end)
  |> Distru.Workers.BrokenBatch.new_batch(batch_id: oban_batch_id)
  |> tap(fn jobs_in_batch ->
    Logger.error("Created #{length(jobs_in_batch)} jobs in batch #{oban_batch_id}")
  end)
  |> Oban.insert_all()
end

run.()

What I am seeing in the logs:

INSIDE PROCESS 982cddc5-828c-45d7-8119-88d2494799e5
FINISH PROCESS 982cddc5-828c-45d7-8119-88d2494799e5

Expected Behavior

What I thought I would see in the logs when we ran the above test script:

INSIDE PROCESS 982cddc5-828c-45d7-8119-88d2494799e5
FINISH PROCESS 982cddc5-828c-45d7-8119-88d2494799e5
INSIDE HANDLE ATTEMPTED

However, if I change the script to have two jobs in the batch like this:

run = fn ->
  require Logger

  oban_batch_id = Ecto.UUID.generate()

  Enum.map([1, 2], fn _ -> %{} end)
  |> Distru.Workers.BrokenBatch.new_batch(batch_id: oban_batch_id)
  |> tap(fn jobs_in_batch ->
    Logger.error("Created #{length(jobs_in_batch)} jobs in batch #{oban_batch_id}")
  end)
  |> Oban.insert_all()
end

run.()

Then, I am seeing what I thought I would in the logs:

INSIDE PROCESS 5e556824-af15-4e81-9879-b4c5458dc9a3
FINISH PROCESS 5e556824-af15-4e81-9879-b4c5458dc9a3
INSIDE PROCESS 5e556824-af15-4e81-9879-b4c5458dc9a3
FINISH PROCESS 5e556824-af15-4e81-9879-b4c5458dc9a3
INSIDE HANDLE ATTEMPTED

To me, it seems like we are only getting the handle_attempted callback if there is more than one job in the batch. Is this expected behaviour? It looks like I can code around this by checking the number of jobs in the batch via all_batch_jobs and then manually execute the callback(s) as needed like this:

# ...
    if(all_batch_jobs(job) |> Enum.count() < 2) do
      Logger.error("INSIDE PROCESS #{batch_id} WITH ONLY ONE JOB -- FIRING COMPLETE CALLBACK MANUALLY")
      handle_attempted(job)
    end
# ...

However, this feels wrong to me. Do you have an insight on this behaviour and/or the approach I should take to get around it so that the callback(s) execute at the end of the batch regardless of job count in the batch?

@antogon The move to async in Pro v1.3 unintentionally broke batch queries with a race condition between debouncing the batch and flushing the acks. That's fixed in v1.3.3, released yesterday.

Here's an invite link for the Elixir Slack (it's also available as the "Community" link on the getoban.pro site). Hopefully we can get the general request invitation app running again.