Caching technique has not scaled with ecosystem size
Closed this issue · 1 comments
djsegal commented
This code says every package that's been modified in the last 5 days is new. However, this leads to around 20x more hits per recently updated package.
Simple fix would be to read cache to check if the URL was from more than 5 days ago?
// the original purpose of this was that api hits with caching are tricky in the 1-2 days ago range
def is_new_response? url
return true \
unless Batch.active_marker_date.present?
url_response = HTTParty.get url, \
query: @client_info, \
headers: @url_headers.merge({
"If-Modified-Since" => 5.days.ago.rfc822
})
validate_response url, url_response
return false if url_response.code == 304
Rails.cache.write url, url_response.to_yaml
true
end