GeneralMills/pytrends

500 internal server errors

hussieneloy opened this issue · 48 comments

Hello,
I am using the newest version of pytrends, 4.9.1, but since yesterday I have been getting many 500 (Internal Server Error) responses. The errors occur with the related_queries and related_topics methods.
This is different from a few months ago, when after a Google update we were getting 429 errors, which indicate too many requests. I tried from different machines with different IPs, but the result is the same.
Has this happened to anybody else? I haven't seen other issues opened yet. Is it a temporary problem with Google's servers, or has something changed in their interface that would require an update to the scraping method?

It is the same for all tools like this when requesting the past few days (like now 1-d) :/

It happened to me too. Even with simple code and few requests, the error appears: "The request failed: Google returned a response with code 500".

Same here, it's been happening in the last few days...

Hey, same problem here when trying to retrieve recent data (past week), and it has been happening since yesterday

The same problem. The problem remained when using a proxy.

Hi, has anyone figured out the solution? Thanks

Same here :(

same problem...

Also looking for a solution..

Same happening with me

I have the same problem. Also with the npm package

This looks like a problem in the Google Trends backend, if that's the case we can do nothing about it.

Someone already traced that the website didn't really change its format; the only difference is the dreaded USER_TYPE_SCRAPER vs USER_TYPE_LEGIT_USER: PMassicotte/gtrendsR#451 (comment)

Let's hope that it's just a problem in the Trends backend and not more throttling to scrapers.

It's not enough to have a valid NID cookie now. Google Trends now expects a POST request to https://trends.google.com/trends/api/explore that carries a reCAPTCHA token for the specific search parameters. After that, related_queries and related_topics are successfully retrieved.

[Screenshot: actual network request]

The same issue happened some months ago (ref).

Here's part of the Google Trends JS code. (Changing window.enableRecaptcha and this.enableRecaptcha_ to false results in blocked requests in the browser.)

d.getExploreReport = function(a) {
  var b = this
    , c = {
    req: JSON.stringify(a),
    tz: this.configService_.userTimezoneOffset
  };
  return this.enableRecaptcha_ ? this.recaptchaService_.loadRecaptchaToken().then(function(e) {
    return b.http_.post(b.apiPathPrefix_, encodeURIComponent(e), {
      params: c
    })
  }) : this.http_.get(this.apiPathPrefix_, {
    params: c
  })
};
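
Read as Python, the branch in the snippet above amounts to roughly the following. This is only a sketch of the request shape, not a working bypass: `build_explore_request` and its arguments are hypothetical names, the token is a placeholder that would have to come from a real reCAPTCHA challenge, the default `tz` of 360 is just an example value, and nothing is actually sent.

```python
import json
from urllib.parse import quote, urlencode

# Endpoint mentioned earlier in this thread.
EXPLORE_URL = "https://trends.google.com/trends/api/explore"


def build_explore_request(payload, tz=360, recaptcha_token=None):
    """Describe the request getExploreReport would make, without sending it.

    Hypothetical helper: `payload` stands for the explore request object
    and `recaptcha_token` for whatever loadRecaptchaToken() resolves to.
    """
    # Same query parameters as the JS: req (JSON string) and tz.
    params = {"req": json.dumps(payload), "tz": tz}
    url = f"{EXPLORE_URL}?{urlencode(params)}"

    if recaptcha_token is None:
        # enableRecaptcha_ is false: plain GET with the params.
        return {"method": "GET", "url": url, "body": None}

    # enableRecaptcha_ is true: POST with the URI-encoded token as the body.
    # safe="!*'()" makes quote() escape the same characters as
    # encodeURIComponent.
    body = quote(recaptcha_token, safe="!*'()")
    return {"method": "POST", "url": url, "body": body}


request = build_explore_request({"keyword": "pizza"}, recaptcha_token="placeholder/token")
print(request["method"], request["body"])
```

The point of the sketch is that the only difference between the two paths is the HTTP method and the token in the body; the `req` and `tz` parameters are identical in both.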

Same problem here.

As @Terseus mentioned, let's just hope this is an actual bug with Trends API and not a prevention for scrapers.

Otherwise, some Selenium logic would be very useful to solve this problem.

@pepi99, I tried using Selenium, but I keep getting a 429 error when the Google Trends page loads automatically. Did you get it to work?

@felicianomariom, Nope, I am just starting to work on this now. Will keep you updated.

@felicianomariom Have a look at this code:

https://github.com/Aman7818/Google-Trends/blob/main/Api/views.py

Based on their code, you can add an additional click on the download button to download the data locally; then you can just load it with pandas and do whatever you want with it. Here is the script to download the CSV:

import time
import random
from selenium.webdriver.common.by import By
import undetected_chromedriver as uc

# Comma-separated search terms typed into the Trends search box
search_term = 'term1, term2'

# Headless Chrome with a desktop user agent
options = uc.ChromeOptions()
options.add_argument("--disable-extensions")
options.add_argument('--headless')
options.add_argument(
    "--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.5672.126 Safari/537.36")

driver = uc.Chrome(use_subprocess=True, options=options)

# Bounds (in seconds) for the short random pauses between actions
min_sleep_time = 1
max_sleep_time = 2

# Always start from the home page rather than a pre-built explore URL
driver.get("https://trends.google.com/home?hl=en-US")
time.sleep(random.randint(min_sleep_time, max_sleep_time))

# Click the search box and type the search terms
first_click = driver.find_element(By.CLASS_NAME, "VfPpkd-fmcmS-yrriRe.VfPpkd-fmcmS-yrriRe-OWXEXe-mWPk3d")
first_click.click()
first_click.send_keys(search_term)
time.sleep(random.randint(min_sleep_time, max_sleep_time))

# Click the "Explore" button if it is present
explore_btn = driver.find_elements(By.CLASS_NAME, "UywwFc-LgbsSe.UywwFc-LgbsSe-OWXEXe-dgl2Hf.Qt4Qjb")
if len(explore_btn) > 0:
    explore_btn_1 = explore_btn[0].find_element(By.CLASS_NAME, "UywwFc-vQzf8d")
    explore_btn_1.click()

# Wait for the explore page to render, then click the export (CSV) button
# of the first widget
time.sleep(5)
export_btn = driver.find_element(By.CSS_SELECTOR, "body > div.trends-wrapper > div:nth-child(2) > div > md-content > div > div > div:nth-child(1) > trends-widget > ng-include > widget > div > div > div > widget-actions > div > button.widget-actions-item.export")
export_btn.click()

# Give the download time to finish before closing the browser
time.sleep(20)

driver.quit()

Make sure to also add additional logic for the region and the time. If you happen to do it before me, please paste your modified code here as well.

NOTE: Don't set the base URL to an already modified URL with search term/s, region and time, because if you are not going through the clicking process from the base URL, Google Trends will limit you after a couple of requests and you will get a 429 error.
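
Once the CSV is on disk, loading it could look roughly like the sketch below. It uses only the standard library so it runs anywhere. The layout of the export (a "Category: ..." line and a blank line before the real header) is an assumption, the sample text is made up for illustration, and `parse_trends_csv` is a hypothetical helper name.

```python
import csv
import io


def parse_trends_csv(text):
    """Return the data rows of an exported Trends CSV as a list of dicts.

    Assumes the real header is the first line that contains a comma;
    anything before it is treated as preamble and skipped.
    """
    lines = text.splitlines()
    start = next(i for i, line in enumerate(lines) if "," in line)
    reader = csv.DictReader(io.StringIO("\n".join(lines[start:])))
    return list(reader)


# Made-up sample in the assumed export layout.
sample = """Category: All categories

Week,term1: (Worldwide),term2: (Worldwide)
2023-06-04,42,17
2023-06-11,45,19
"""

rows = parse_trends_csv(sample)
for row in rows:
    print(row["Week"], row["term1: (Worldwide)"], row["term2: (Worldwide)"])
```

For a real download, read the file contents first (for example with `open(path, encoding="utf-8-sig").read()`) and pass the text in. With pandas, `pandas.read_csv` with a suitable `skiprows` value should give the same rows as a DataFrame, under the same layout assumption.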

Same here

pytrends.exceptions.ResponseError: The request failed: Google returned a response with code 500

+1

Started having problems yesterday and now every request responds with a server error.

I've been having this issue for about 5 days and all requests respond with server errors like above.

@pepi99 I'm trying your code, it works great so far on local, but when deployed on Google Cloud Run it never draws the actual trendline. I don't have a clue of what's going on... This is a screenshot taken from the actual process, as you can see, I can get as far as to change the date and everything... I know that I'm crossing a line here, but maybe you have an idea of what might be going on... Thanks

https://storage.googleapis.com/nmd_img/envios/30079ac5-25a0-4b8f-944b-dac7f5454a94.png

Any news?

send commented

It seems to work fine with timeframes from more than 2 weeks ago, like 2023-06-26T09 2023-06-27T09, but more recent timeframes do not work.

@send I am running a collection from 2020 to 2020-06-15, but it doesn't work, so old timeframes are affected. You only have a day-long window, so I am curious why that makes it work for you. Conclusions from above seemed to say you needed to submit a new POST request. Maybe that isn't necessary for smaller data hauls?

@dtaubaso tbh, I am not sure what is the reason for that. I am running a modified version of the script I provided on a remote server and it seems to run just fine for now. If you provide me more details, I can try to help.

@pepi99 don't worry, I'm using Playwright and it works fine, problem is that it takes too long to retrieve the information for each keyword, I hope someone finds a fix for the API soon...

send commented

@RiiNagaja I tried a request with the timeframe 2020-01-01 2020-06-15 and was able to get daily data for that period.
I don't know the reason, but it seems that the data that can be retrieved is limited by the format of the timeframe.

Hello, is it therefore confirmed that it is a transient problem of Google Trends? or should we think of another way to retrieve weekly and daily data? (example: using Selenium)

Right, I forgot to say that I run a script that automatically subdivides timeframes so that hourly data can be retrieved. So it is only that part that isn't working, how strange.
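
For anyone unfamiliar with that approach, subdividing a range into hourly-format timeframes could look roughly like this. It is a sketch, not the script mentioned above: `split_timeframe` is a hypothetical helper, and the 7-day window is an assumption about how much hourly data one request can cover. The string format matches the `2023-06-26T09 2023-06-27T09` style used earlier in this thread.

```python
from datetime import datetime, timedelta


def split_timeframe(start, end, window=timedelta(days=7)):
    """Split [start, end] into consecutive 'YYYY-MM-DDTHH YYYY-MM-DDTHH' strings.

    The 7-day default window is an assumption, not a documented limit.
    """
    fmt = "%Y-%m-%dT%H"
    frames = []
    cursor = start
    while cursor < end:
        # Clamp the last window so it never runs past the requested end.
        stop = min(cursor + window, end)
        frames.append(f"{cursor.strftime(fmt)} {stop.strftime(fmt)}")
        cursor = stop
    return frames


frames = split_timeframe(datetime(2020, 1, 1), datetime(2020, 1, 20))
for frame in frames:
    print(frame)
```

Each string could then be passed as the `timeframe` argument of `build_payload`, one request per window, and the partial results concatenated afterwards.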

Any news? It doesn't work with the now 7-d timeframe.

I just noticed that the response changed from 500 (Internal Server Error) to 429 (Too many requests) ....

Yep, I'm getting 429 errors on nearly every request now.

Anyone else?

Yes, same here...

yeah the same in node.js lib (it has the same mechanism like this lib in python) :(.

same problem

Same here

same, any news ?

same issue

3 weeks without pytrends😒

Has anyone tried reaching out to Google?

Not in relation to the PyTrends library but rather that the API isn't working at all, including for the embedding widget.

If anyone wants to embed a Google Trends chart on their website or share it to socials it's missing data or just doesn't work at all.

This API issue is the cause for libraries such as PyTrends to return 429 & 500 errors, empty dataframes etc..

I reported this exact issue (embed feature) via the Google Trends site 3 weeks ago, but there has been no answer from Google yet.

Very atypical of them to allow an issue to persist for so long. I wonder what's going on behind the scenes.

@pepi99 I am unable to access the link you posted: https://github.com/Aman7818/Google-Trends/blob/main/Api/views.py. Is there any way for it to be made accessible to the public?

Good news!! The API is back to life! Google also solved the GTrends embed issue ..

Seems to be working fine now, let's hope it lasts

OK. Let's hope it remains that way. I think keeping the issue open serves no purpose now.