This is not an officially supported Google product. It is a reference implementation.
This tool is a Python-based solution that aggregates Insights data from the Google My Business API and stores it in Google Cloud Platform, specifically in BigQuery. Insights data describes how users interact with Google My Business listings via Google Maps, such as the number of queries for a location, the locations from which people searched for directions, and the number of website clicks, calls, and reviews. The tool provides a cross-account look at the data instead of a per-location view.
Along with gathering stats, the Google Cloud Natural Language API is used to provide sentiment analysis and entity detection for supported languages. This is a fully automated process that pulls data from the Google My Business API and places it into a BigQuery instance, processing each review's content with the Natural Language API to generate a sentiment score for analysis.
- All locations must roll up to a Location Group (formerly known as a Business Account). Click here for more information. Multiple location groups are supported and can be queried accordingly (refer to the samples inside the sql directory).
- All locations must be verified.
Install the required dependencies
$ pip install --upgrade --quiet --requirement requirements.txt
Follow the steps to Enable the API within the Google My Business basic setup guide and create the necessary OAuth2 Credentials required for the next steps.
Go to Enable the Cloud Natural Language API.
Go to Enable the BigQuery API.
Please note that BigQuery provides a sandbox if you do not want to provide a credit card or enable billing for your project. The steps in this topic work for a project whether or not your project has billing enabled. If you optionally want to enable billing, see Learn how to enable billing.
Create a file named client_secrets.json with the credentials downloaded as JSON from your Google Cloud Platform Project API Console.
Go to the Samples page, right click Download discovery document, and select Save Link As. Then, save the file as gmb_discovery.json in the same directory.
Execute the script to start the process of retrieving the reviews for all available locations from all accessible accounts for the authorized user:
$ python main.py --project_id=<PROJECT_ID>
The script generates a number of tables in a BigQuery dataset named alligator.
Usage:
$ python main.py [-h] -p PROJECT_ID [-a ACCOUNT_ID] [-l LOCATION_ID]
[--no_insights] [--no_reviews] [--no_sentiment]
[--no_directions] [--no_hourly_calls] [--sentiment_only] [-v]
Optional arguments:
-h, --help show this help message and exit
-p PROJECT_ID, --project_id PROJECT_ID
a Google Cloud Project ID
-a ACCOUNT_ID, --account_id ACCOUNT_ID
retrieve and store all Google My Business reviews for
a given Account ID
-l LOCATION_ID, --location_id LOCATION_ID
retrieve and store all Google My Business reviews for
a given Location ID (--account_id is also required)
--language LANG
the ISO-639-1 language code in which the Google My Business
reviews are written (used for sentiment processing). See
https://cloud.google.com/natural-language/docs/languages
for a list of supported languages
--no_insights skip the insights processing and storage
--no_reviews skip the reviews processing and storage
--no_sentiment skip the sentiment processing and storage
--no_directions skip the directions processing and storage
--no_hourly_calls skip the hourly calls processing and storage
--sentiment_only only process and store the sentiment of all available
reviews since the last run (if --no_sentiment is
provided, no action is performed)
-v, --verbose increase output verbosity
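For readers who want to see how a command line like this can be declared, here is a minimal argparse sketch reconstructed from the help text above. It is not the tool's actual main.py: the function names build_parser and parse are hypothetical, and the check that --location_id needs --account_id is an assumption based on the option description.

```python
import argparse


def build_parser():
    """Declare the flags listed in the usage text (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("-p", "--project_id", required=True,
                        help="a Google Cloud Project ID")
    parser.add_argument("-a", "--account_id")
    parser.add_argument("-l", "--location_id")
    parser.add_argument("--language")
    # Each --no_* switch disables one processing stage.
    for stage in ("insights", "reviews", "sentiment", "directions", "hourly_calls"):
        parser.add_argument(f"--no_{stage}", action="store_true")
    parser.add_argument("--sentiment_only", action="store_true")
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser


def parse(argv):
    """Parse argv and enforce that a location is always scoped to an account."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.location_id and not args.account_id:
        # Assumed behavior: exit with a usage error, as argparse does by default.
        parser.error("--location_id requires --account_id")
    return args
```

For example, parse(["-p", "my-project", "--no_reviews"]) yields a namespace with project_id set and no_reviews switched on, while all other switches stay off.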
For the initial data load into BigQuery, a maximum of 18 months of insights data is retrieved, ending 5 days before the current date. This is due to the posted 3-5 day delay before data becomes available in the Google My Business API. For phone calls and driving directions, only data from the last 7 days is retrieved. Finally, data is inserted into BigQuery in batches of 5000 rows to avoid running into API limits, especially when using the BigQuery Sandbox. These defaults are defined in api.py and can be tuned according to individual needs.
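The defaults described above can be illustrated with a short sketch. The constant and function names below are hypothetical (the real ones live in api.py and may differ), and 18 months is approximated as 18 × 30 days counted back from the end of the window, which is an assumption made to stay within the standard library.

```python
from datetime import date, timedelta

# Assumed names; the values mirror the defaults described in the text.
INSIGHTS_MONTHS = 18
INSIGHTS_DELAY_DAYS = 5
BATCH_SIZE = 5000


def insights_window(today):
    """Return (start, end) dates for the initial insights load."""
    end = today - timedelta(days=INSIGHTS_DELAY_DAYS)
    # Approximation: treat a month as 30 days.
    start = end - timedelta(days=INSIGHTS_MONTHS * 30)
    return start, end


def batches(rows, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` rows."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]
```

With these helpers, 12000 rows would be sent as three inserts of 5000, 5000, and 2000 rows.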
Furthermore, all available reviews in BigQuery will be used only for the first run of the sentiment analysis. Once the analysis is complete, an empty file named sentiments_lastrun will be created in the application's root directory, and this file's modification timestamp will be used for subsequent sentiment analysis runs so that only non-analyzed reviews are taken into consideration. Delete the file to rerun the analysis on all available reviews.
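The marker-file mechanism can be sketched as follows. This is an illustration of the idea rather than the tool's code: the helper names last_run and mark_run are hypothetical, and only the file name sentiments_lastrun comes from the description above.

```python
import os
from datetime import datetime, timezone

MARKER = "sentiments_lastrun"  # file name taken from the text above


def last_run(path=MARKER):
    """Return the marker's modification time in UTC, or None before the first run."""
    if not os.path.exists(path):
        return None
    return datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)


def mark_run(path=MARKER):
    """Create the empty marker if needed and set its modification time to now."""
    with open(path, "a"):
        pass
    os.utime(path, None)
```

A run would call last_run() first, analyze only reviews newer than the returned timestamp (or all reviews when it returns None), and call mark_run() once the analysis finishes.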
Finally, you can use the --language CLI flag to set the language that the Cloud Natural Language API should use for the sentiment analysis. This is particularly useful for reviews that may contain multiple languages. Refer to this post for a list of languages supported by the API. You might need to deactivate one or more of the text annotation features in api.py accordingly if your language is not yet supported.
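To make the language and feature settings concrete, here is a sketch that builds a JSON request body in the shape used by the Cloud Natural Language documents:annotateText REST method. Whether api.py uses this method is an assumption, and build_annotate_request is a hypothetical helper; only the field names follow the public REST reference.

```python
def build_annotate_request(text, language=None, features=None):
    """Build a request body for documents:annotateText (illustrative only)."""
    document = {"type": "PLAIN_TEXT", "content": text}
    if language:
        # ISO-639-1 code, e.g. the value passed via --language.
        document["language"] = language
    # Assumed default feature set; turn a feature off if the language lacks support.
    enabled = {"extractDocumentSentiment": True, "extractEntities": True}
    if features is not None:
        enabled = dict(features)
    return {"document": document, "features": enabled, "encodingType": "UTF8"}
```

For a language without entity support, one could pass features={"extractDocumentSentiment": True} so that only the sentiment score is requested.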
- Tony Coconate (coconate@google.com) – Google
- Miguel Fernandes (miguelfc@google.com) – Google
- Mohab Fekry (mfekry@google.com) – Google
- David Harcombe (davidharcombe@google.com) – Google