fatkodima/sidekiq-iteration

Can I use sidekiq-iteration in a job that reads a remote file?

# frozen_string_literal: true

require "json"
require "open-uri"

class BulkOperationDataRetrieveJob
  include Sidekiq::Job

  sidekiq_options queue: :bulk_operation_data_retrieve, retry: false

  def perform(shop_domain, url)
    # Kept in an instance variable so process_line can reach the shop.
    @shop = Shop.find_by(shopify_domain: shop_domain)

    if @shop.nil?
      logger.error("#{self.class} failed: cannot find shop with domain '#{shop_domain}'")
      return
    end

    read_file(url)
  end

  private
    def read_file(url)
      file_path = "tmp/customers.jsonl"
      # The block form closes the remote handle once the copy finishes.
      URI.open(url) { |remote| IO.copy_stream(remote, file_path) }

      # Parse the data file: one JSON document per line
      File.open(file_path) do |f|
        f.each_line do |line|
          process_line(JSON.parse(line))
        end
      end
    end

    def process_line(line)
      shopify_customer_id = line["id"].gsub("gid://shopify/Customer/", "").to_i
      @shop.shopify_customers.find_or_create_by(shopify_id: shopify_customer_id) do |customer|
        customer.email = line["email"]
        customer.phone = line["phone"]
        customer.amount_spent = line["amountSpent"]["amount"].to_f
      end
    end
end

I have the above job and the remote file contains 100K customers.
Can I use sidekiq-iteration in this job?

Sure, but you need to figure out how to write a custom cursor for this (https://github.com/fatkodima/sidekiq-iteration/blob/master/guides/custom-enumerator.md), which is the hard part. One (dumb?) solution is to download the file locally, parse it, push some ids into a Redis list, write a custom Redis enumerator that iterates over this list, and use that enumerator in the job.
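
A minimal sketch of what such a Redis list enumerator might look like. The class name `RedisListEnumerator`, the `batch_size` option, and the `FakeRedis` stand-in are all hypothetical and not part of the gem; the only client call assumed is `lrange(key, start, stop)`, which redis-rb provides. It also assumes the convention from the custom enumerator guide that the enumerator yields `[item, cursor]` pairs and resumes after the stored cursor.

```ruby
# Hypothetical enumerator: walks a Redis list in batches and yields
# [item, cursor] pairs, where the cursor is the item's index in the list.
class RedisListEnumerator
  def initialize(redis, key, cursor: nil, batch_size: 100)
    @redis = redis
    @key = key
    # Assumption: the stored cursor is the index of the last processed item,
    # so a resumed run starts one position after it.
    @start = cursor.nil? ? 0 : cursor + 1
    @batch_size = batch_size
  end

  def to_enumerator
    Enumerator.new do |yielder|
      index = @start
      loop do
        # LRANGE bounds are inclusive on both ends.
        batch = @redis.lrange(@key, index, index + @batch_size - 1)
        break if batch.empty?

        batch.each do |item|
          yielder.yield(item, index)
          index += 1
        end
      end
    end
  end
end

# In-memory stand-in so the sketch runs without a Redis server.
# A real job would pass a redis-rb client instead.
class FakeRedis
  def initialize(lists)
    @lists = lists
  end

  def lrange(key, start, stop)
    @lists.fetch(key, [])[start..stop] || []
  end
end

redis = FakeRedis.new("customers" => %w[a b c d e])

# A first run that is interrupted after three items...
first_run = RedisListEnumerator.new(redis, "customers", batch_size: 2)
                               .to_enumerator.first(3)

# ...and a later run that resumes from the last stored cursor (2).
resumed = RedisListEnumerator.new(redis, "customers", cursor: 2, batch_size: 2)
                             .to_enumerator.to_a
```

In a job, `build_enumerator(shop_domain, list_key, cursor:)` would presumably return `RedisListEnumerator.new(client, list_key, cursor: cursor).to_enumerator`, and `each_iteration` would handle one list item at a time; the `list_key` argument and `client` are made up here for illustration.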

There was a similar discussion (Shopify/job-iteration#50) in the parent gem before.

Sorry for the late reply.

Got it. But my Sidekiq jobs run on Heroku, where the filesystem is ephemeral, so there's no guarantee that a downloaded file (for example, one under tmp/) will still exist when the job resumes.

I fixed this by building a custom iterator. Thank you!
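
The iterator's code is not included in this thread. As a rough sketch of one way to avoid depending on a persisted local file (not necessarily the approach used here), an enumerator can read the JSONL stream line by line and use the line index as the cursor, skipping already processed lines on resume. The class name `JsonlLineEnumerator` is hypothetical, and the `[item, cursor]` yield shape is assumed from the custom enumerator guide.

```ruby
require "json"
require "stringio"

# Hypothetical enumerator: yields [parsed_row, line_index] pairs from any
# IO-like object that responds to each_line.
class JsonlLineEnumerator
  def initialize(io, cursor: nil)
    @io = io
    # Assumption: the cursor is the index of the last processed line.
    @skip = cursor.nil? ? 0 : cursor + 1
  end

  def to_enumerator
    Enumerator.new do |yielder|
      @io.each_line.with_index do |line, index|
        next if index < @skip          # already handled in an earlier run
        next if line.strip.empty?      # tolerate blank lines

        yielder.yield(JSON.parse(line), index)
      end
    end
  end
end

# StringIO stands in for the remote file so the sketch runs offline.
data = [
  { "id" => "gid://shopify/Customer/1" },
  { "id" => "gid://shopify/Customer/2" },
  { "id" => "gid://shopify/Customer/3" }
].map { |row| JSON.generate(row) }.join("\n")

# Resume after line 0 was processed in a previous run.
resumed = JsonlLineEnumerator.new(StringIO.new(data), cursor: 0)
                             .to_enumerator
                             .map { |row, cursor| [row["id"], cursor] }
```

With a real URL, the IO could come from `URI.open(url)`. The trade-off is that each resumed run fetches the file again and skips forward to the cursor, so it costs bandwidth but does not rely on tmp/ surviving between runs.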

@remy727 Can you please share the approach you finally decided on, or the iterator's code? It would be helpful for future readers and could perhaps be incorporated into the gem later.