realroy/keyword_manager

[Feature] Asynchronous processing of keywords

Closed this issue · 3 comments

Issue

Upon uploading keywords, the scraping will be performed synchronously in the controller:

def update
  keywords = ExtractKeywordsFromFileService.new(file: uploads_params[:file], user: current_user).call
  # Each keyword is scraped inline, so the request blocks until every scrape finishes
  keywords.each { |keyword| ScrapeFromKeywordService.new(keyword:).call }
  redirect_to keywords_path
rescue StandardError => e
  Rails.logger.error("Keyword upload failed: #{e.message}")
  flash[:error] = 'Something went wrong! Please try again.'
  render 'show'
end

If a user uploads up to 100 keywords, as the requirement allows, the request can take a long time to complete and may even exceed the HTTP request timeout. In addition, if scraping any single keyword raises an error, processing of the remaining keywords stops.

Expected

  • Keywords should be stored in the database with a flag that tracks the scraping status.
  • A distinct background job to scrape the Google results page is then scheduled for each keyword.

The benefits are:

  • The CSV upload request returns quickly.
  • The scraping of each keyword is isolated, i.e., one can fail while the others succeed.
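The expected flow above can be sketched in plain Ruby, independent of Rails. This is only an illustration: the `Keyword` struct, `upload` method, and `QUEUE` stand-in for the real model, controller action, and job queue, none of which are named in the issue.

```ruby
# Hedged sketch of the expected flow: the upload request only persists
# keywords as :pending and enqueues one job per keyword; no scraping
# happens inline. All names here are illustrative, not from the codebase.
Keyword = Struct.new(:name, :status)
QUEUE = [] # stand-in for the background job queue

def upload(names)
  names.map do |name|
    keyword = Keyword.new(name, :pending) # stored with a status flag
    QUEUE << keyword                      # a distinct job is scheduled per keyword
    keyword
  end
end

keywords = upload(%w[ruby rails sidekiq])
# The request can now redirect immediately; scraping happens later.
```

Because the request only writes rows and enqueues jobs, its duration no longer grows with the time each scrape takes.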

I totally agree. Scraping now runs asynchronously with Sidekiq. Additionally, I've added a scrape status (pending, failed, or successful) to each keyword for tracking purposes.
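The per-keyword status transitions described here (pending, then successful or failed) can be modeled in a few lines of plain Ruby. The `process` stub below stands in for `ScrapeFromKeywordService` and deliberately raises for one keyword; it is an assumption for illustration, not the actual service.

```ruby
# Hedged sketch: pending -> successful / failed transitions, with each
# keyword's failure isolated from the others (no Rails or Sidekiq needed).
Keyword = Struct.new(:name, :status)

def process(keyword)
  # Stand-in for ScrapeFromKeywordService; raises to simulate one failing scrape
  raise "scrape failed" if keyword.name == "bad"
  keyword.name.upcase
end

def scrape(keyword)
  process(keyword)
  keyword.status = :successful
rescue StandardError
  # One keyword failing does not abort the rest of the batch
  keyword.status = :failed
end

keywords = [Keyword.new("ruby", :pending), Keyword.new("bad", :pending)]
keywords.each { |k| scrape(k) }
```

In the real app, each `scrape` call would be the body of a Sidekiq job's `perform`, so a raised error marks only that keyword as failed (and can trigger Sidekiq's retry) instead of halting the whole upload.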


Using Sidekiq was the correct technical call 👍 I have added a comment on the merged PR for your review.