eddiewebb/circleci-queue

until_front_of_line doesn't block on Alpine images

dcelasun opened this issue · 6 comments

Perhaps I'm doing something wrong, but I can't get until_front_of_line to work. The job config looks like this:

deploy-step:
    docker:
      - image: alpine:3.9
    steps:
      - checkout
      - run:
          name: Install curl and jq
          command: |
            apk add --no-cache curl jq
      - queue/until_front_of_line:
          time: '10'
          only-on-branch: develop
          block-workflow: true
      - run:
          command: # some deploy command

Then I push two commits to develop with ~10 seconds between them. The log from the first job:

develop queueable
This build will block until all previous builds complete.
Max Queue Time: 10 minutes.
Only blocking execution if running previous jobs on branch: develop
Attempting to access CircleCI api. If the build process fails after this step, ensure your CIRCLECI_API_KEY is set.
API access successful
Orb parameter block-workflow is true.
This job will block until no previous workflows have *any* jobs running.
Oldest job: 
This Workflow Timestamp: "2019-09-04T12:33:17+01:00"
Oldest Workflow Timestamp: "2019-09-04T12:33:17+01:00"
Front of the line, WooHoo!, Build continuing

and the second one:

develop queueable
This build will block until all previous builds complete.
Max Queue Time: 10 minutes.
Only blocking execution if running previous jobs on branch: develop
Attempting to access CircleCI api. If the build process fails after this step, ensure your CIRCLECI_API_KEY is set.
API access successful
Orb parameter block-workflow is true.
This job will block until no previous workflows have *any* jobs running.
Oldest job: 
This Workflow Timestamp: "2019-09-04T12:33:27+01:00"
Oldest Workflow Timestamp: "2019-09-04T12:33:17+01:00"
Front of the line, WooHoo!, Build continuing

As you can see, the oldest workflow timestamp is earlier than this workflow's timestamp, but the build continues anyway. Any ideas?

I had a similar case last week. We have queuing in the middle of the workflow to block simultaneous deploys. We recently added Docker layer caching to the build, so the first build spent a long time building the image and the next one didn't.

I could see that the workflow that started later waited for 1 minute 30 seconds, but continued once the other workflow reached its queuing step. I believe this could be because before that step we have two simultaneous jobs, so at least one is running at any time, whereas the queuing step has only one job.

I also reported a bug to CircleCI: the endpoint used by this orb doesn't work as expected. Filtering doesn't work, so it always returns all jobs. Then again, I don't think that should affect this issue...

I had some time to look at the queue loop, which is where the bug would be occurring. Since there's no logic between the comparison and the output, I fired up a pair of Docker containers, one with bash and one with Alpine.

#!/bin/bash
# test script (using /bin/ash in alpine)
my_commit_time="2019-09-04T12:33:27+01:00"
oldest_commit_time="2019-09-04T12:33:17+01:00"

if [[ "$oldest_commit_time" > "$my_commit_time" ]] || [[ "$oldest_commit_time" == "$my_commit_time" ]]; then
  # API returns Y-M-D HH:MM (with 24 hour clock) so alphabetical string compare is accurate to timestamp compare as well
  # in event of race, everyone wins
  echo "Front of the line, WooHoo!, Build continuing"
else
  echo "I would be queued"
fi

Output in a bash container:

$ ./test.sh
I would be queued

Output in an ash container:

/ # ./test.sh
Front of the line, WooHoo!, Build continuing

Unfortunately, any fix is going to be really difficult, as /bin/ash doesn't allow for bashisms like [[ ... ]] and ==. I would recommend running against a non-Alpine container, which should fix this issue.
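
For anyone who wants to experiment with a shell-agnostic comparison, here is a minimal sketch that uses the POSIX expr utility instead of [[ ... ]]. This is not the orb's code: the helper name sorts_at_or_after is hypothetical. Like the original string comparison, it only orders timestamps correctly when they share the same format and UTC offset.

```sh
#!/bin/sh
# Hypothetical helper, not part of the orb: succeeds (exit 0) when
# timestamp $1 sorts at or after timestamp $2 as a plain string.
sorts_at_or_after() {
  # The "x" prefix keeps expr from reading an argument as a number or
  # an operator; LC_ALL=C forces byte-wise ordering.
  [ "$(LC_ALL=C expr "x$1" \>= "x$2")" = "1" ]
}

my_commit_time="2019-09-04T12:33:27+01:00"
oldest_commit_time="2019-09-04T12:33:17+01:00"

if sorts_at_or_after "$oldest_commit_time" "$my_commit_time"; then
  echo "Front of the line, WooHoo!, Build continuing"
else
  echo "I would be queued"
fi
```

With the timestamps from the second job's log, the oldest workflow sorts before this one, so the script takes the "I would be queued" branch. Equal timestamps still pass, which keeps the "in event of race, everyone wins" behaviour.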

Hey thanks @jakobo !

Some of our orbs will install bash if not present, but since this orb's job is just to block, I think using a non-Alpine image is best.

I'll keep this open a bit so @dcelasun and @jarikujansuu can confirm they were using Alpine.
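
As a rough illustration of the "install bash if not present" approach mentioned above, a step could look something like the sketch below. The helper name bash_missing is hypothetical, and the apk call assumes an Alpine image. Whether installing bash alone would change which shell the orb's step runs under is not established in this thread.

```sh
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when no bash executable is on PATH.
bash_missing() {
  ! command -v bash >/dev/null 2>&1
}

# Only attempt the install where apk exists (assumption: an Alpine image).
if bash_missing && command -v apk >/dev/null 2>&1; then
  apk add --no-cache bash
fi
```

On images that already ship bash, or that have no apk, the sketch does nothing.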

Yes, I'm using Alpine images.

We don't use Alpine. We actually use the block_workflow job, which seems to use circleci/node:10, though I'm not sure what that image is based on.

@jarikujansuu - the job's default image is debian/bash, so would not be the same issue.

Your issue does sound more like #26, which occurs if this job calls the API at just the right time, when the previous workflow has completed its earlier jobs but is not yet "running" the next one.