
Python Google Analytics Library
(Core Reporting API v3 support)

This is a library for making batch requests to the Google Analytics Core Reporting API v3 and extracting data from a Google Analytics property into Python 3 data structures.

The package uses:

  • OAuth 2.0 client or server access to the Google Analytics API (oauth2client==3.0.0) - for connecting to Google Analytics

  • Google Analytics Core Reporting API v3 - for extracting data

  • Google Analytics Metadata API - for built-in lookup of dimension and metric references

  • Google Analytics Management API - for getting the Account, Property and View tree

Dependencies:

  • Pandas > 0.13.0 - for transforming data into a pandas DataFrame object

  • Numpy > 1.0.0 - for slicing NumPy arrays into chunks

  • google-api-python-client > 1.5.0 - self-explanatory

Recommended usage:

  • The Jupyter interactive shell, for analyzing data

Installation

  • Via pip, with the following command: # sudo pip install pga

The latest versions of Pandas, Numpy and oauth2client are installed automatically as dependencies.

Authentication

First, get a Google client_secret JSON file from the Google API Console.

You may choose one of the following Client ID types:

  • Service account

  • Web application

PGA.init

PGA.init(key_file_location=None, type_of_connection=None, facet_chunk=10, count_day_slice=1)

Constructor. Sets the parameters that control the basic behavior of the instance.

Parameters: key_file_location : string
Path to the client secret JSON file.
type_of_connection : string
Available values are 'Client' and 'Server'. Use 'Server' for a service account and 'Client' for a web application.
facet_chunk : int, optional
Number of requests executed in parallel in one chunk. Important: Google Universal Analytics executes only 10 parallel requests per second; to raise this limit, contact Google through its request form.
count_day_slice : int, optional
Number of days in each slice of the [start-date, end-date] range of your request.
For example:
(input)
{'count_day_slice': 2, 'start_date': '2016-12-01', 'end_date': '2016-12-05'}
(output)
[{'start_date': '2016-12-01', 'end_date': '2016-12-02'},
{'start_date': '2016-12-03', 'end_date': '2016-12-04'},
{'start_date': '2016-12-05', 'end_date': '2016-12-05'}]
Returns: self : self
Returns the instance itself with its current settings.
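
The count_day_slice example above can be reproduced with a short standalone sketch. This is not the library's own code: the function name slice_date_range is hypothetical, and the only assumption is that the last slice is clipped to end_date, as the example output shows.

```python
from datetime import date, timedelta


def slice_date_range(start_date, end_date, count_day_slice=1):
    """Split [start_date, end_date] into consecutive windows of count_day_slice days."""
    if count_day_slice < 1:
        raise ValueError("count_day_slice must be at least 1")
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    step = timedelta(days=count_day_slice)
    slices = []
    while start <= end:
        # The last window is clipped so it never runs past end_date.
        stop = min(start + step - timedelta(days=1), end)
        slices.append({"start_date": start.isoformat(), "end_date": stop.isoformat()})
        start += step
    return slices


slices = slice_date_range("2016-12-01", "2016-12-05", count_day_slice=2)
for item in slices:
    print(item)
```

With the input from the example, this prints the same three date ranges, the last one covering a single day.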

Calling the constructor creates the instance and redirects the client to a browser to authenticate with Google.
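
The facet_chunk parameter of the constructor groups requests so that no more than a fixed number run in parallel. How the library does this internally is not shown here, so the following is only a plain-Python sketch of the idea; split_into_chunks and the request_id field are hypothetical names used for illustration.

```python
def split_into_chunks(requests, facet_chunk=10):
    """Group requests into lists of at most facet_chunk items, preserving order."""
    if facet_chunk < 1:
        raise ValueError("facet_chunk must be at least 1")
    return [requests[i:i + facet_chunk] for i in range(0, len(requests), facet_chunk)]


# 23 hypothetical request descriptions; real ones would hold query parameters.
pending = [{"request_id": n} for n in range(23)]
chunks = split_into_chunks(pending, facet_chunk=10)
print([len(chunk) for chunk in chunks])  # → [10, 10, 3]
```

Each chunk stays within the limit of 10 parallel requests, and a caller could wait between chunks to respect the per-second quota.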

Request add

Add a request to an already instantiated pga object.

Request.add_settings_request

Request.add_settings_request(**settings_products)

Parameters: **settings_products : kwargs
Query parameters in the Core Reporting API v3 request format. For the list of query parameters, see https://developers.google.com/analytics/devguides/reporting/core/v3/reference?hl=ru#q_summary
Returns: self : self
Returns the instance itself with its current settings.

You can later update any query parameter that has already been set and then make a new request.
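
As a rough illustration of what such a set of query parameters might look like, the sketch below builds a settings dictionary and checks it for the parameters that the Core Reporting API v3 requires on every query (ids, start-date, end-date, metrics). It does not call the library. The underscore spelling of the keys follows the count_day_slice example, the helper missing_required is hypothetical, and "ga:XXXXXXXX" is a placeholder for a real view ID. Whether Request.add_settings_request accepts exactly these keys is an assumption.

```python
# Parameters the Core Reporting API v3 requires on every query,
# written with underscores as in the count_day_slice example.
REQUIRED = ("ids", "start_date", "end_date", "metrics")


def missing_required(settings):
    """Return the required query parameters that are absent from settings."""
    return [name for name in REQUIRED if name not in settings]


# Hypothetical values; "ga:XXXXXXXX" stands in for a real view ID.
settings = {
    "ids": "ga:XXXXXXXX",
    "start_date": "2016-12-01",
    "end_date": "2016-12-05",
    "metrics": "ga:sessions",
    "dimensions": "ga:date",
}
print(missing_required(settings))  # → []

# Updating one parameter later leaves the others in place.
updated = {**settings, "metrics": "ga:sessions,ga:users"}
print(updated["metrics"])
```

A dictionary like this would presumably be unpacked into keyword arguments, in the form add_settings_request(**settings).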

Execute DataFrame

Execute all settings to get a DataFrame.

PGA.get_dataframe

PGA.get_dataframe(groupby=True)

Parameters: groupby : boolean
Available values are True and False.
If True, the DataFrame is grouped by all dimensions, dates, and start-index. All columns are also cast to the appropriate type based on the Google Analytics Metadata API.
If False, the data is not grouped. This option exists so that you can use another library that aggregates and groups data quickly, because in some cases the data is too large and grouping is very slow. One project worth a look is http://dask.pydata.org/en/latest/
Returns: data : pandas.DataFrame object
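
To make the effect of groupby=True more concrete, here is a small pandas sketch of the same idea on made-up data. It is not the library's implementation: the rows are hypothetical, the cast to int64 stands in for the type lookup through the Metadata API, and summing is an assumed aggregation that only suits additive metrics such as ga:sessions.

```python
import pandas as pd

# Hypothetical rows as they might arrive from two separate requests:
# the same dimension values can appear more than once.
rows = pd.DataFrame(
    {
        "ga:date": ["20161201", "20161201", "20161202", "20161202"],
        "ga:country": ["Ukraine", "Ukraine", "Ukraine", "Poland"],
        "ga:sessions": ["10", "5", "7", "3"],  # API values arrive as strings
    }
)

# Cast the metric to an integer type before aggregating.
rows["ga:sessions"] = rows["ga:sessions"].astype("int64")

# Group by every dimension column and sum the metric.
grouped = rows.groupby(["ga:date", "ga:country"], as_index=False)["ga:sessions"].sum()
print(grouped)
```

With groupby=False, the equivalent of the ungrouped rows frame would be returned, and the aggregation step could be handed to another tool.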

Get pga settings

All settings

Print all current pga settings:

PGA.get_all_settings

PGA.get_all_settings()

Returns: all settings : pandas.DataFrame object

All products

Print all current pga product settings:

PGA.get_all_products

PGA.get_all_products()

Returns: all product settings : pandas.DataFrame object

Additional extra apps

ExtraAppsMetaCdm

Lookup through metadata of Google Analytics dimensions and metrics:

ExtraAppsMetaCdm.get_list_cdcm

ExtraAppsMetaCdm.get_list_cdcm(clarify=None)

Parameters: clarify : string
Attribute used to narrow the selection of dimensions and metrics.
Returns: Table of information : pandas.DataFrame object

ExtraAppsManagementAPI

Get the list of Google Universal Analytics objects (Account ID, Property ID, View ID) that you have access to.

PGA.get_all_profile

PGA.get_all_profile()

Returns: Table of information with dimensions or metrics: pandas.DataFrame object