Exploring path sequences in GA4 BigQuery data
- Create a new Google Cloud project
- Enable the BigQuery API
- Ensure that GA4 data is being sent to BigQuery
- Get the dataset ID and table ID for the GA4 data
- Create a service account with BigQuery read access. Here is a good guide.
- Download the service account key as a JSON file
- Create a new file in the root directory called service_account.json and paste the contents of the JSON file into it
- Run pip install -r requirements.txt to install the required Python packages
- Get your API key from OpenAI if you want to run analyze_clusters
- Open demo.ipynb in Jupyter Notebook and run the cells
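To show how the setup steps fit together, here is a minimal sketch of pulling page paths out of the GA4 export with the service account key. It is not the code in demo.ipynb. The helper names (build_path_query, rows_to_sequences, fetch_sequences) are hypothetical, and the query assumes the standard GA4 export layout: daily tables matched by `events_*`, with the page URL stored under the `page_location` event parameter.

```python
from collections import defaultdict


def build_path_query(project_id: str, dataset_id: str) -> str:
    """Build SQL that returns one row per page_view event.

    Assumption: the dataset uses the standard GA4 export schema
    (daily events_YYYYMMDD tables, event_params as key/value records).
    """
    return f"""
        SELECT
          user_pseudo_id,
          event_timestamp,
          (SELECT value.string_value
             FROM UNNEST(event_params)
            WHERE key = 'page_location') AS page_location
        FROM `{project_id}.{dataset_id}.events_*`
        WHERE event_name = 'page_view'
    """


def rows_to_sequences(rows):
    """Group (user, timestamp, page) rows into one time-ordered page list per user."""
    by_user = defaultdict(list)
    for user, timestamp, page in rows:
        if page is not None:  # skip events without a page_location
            by_user[user].append((timestamp, page))
    return {user: [page for _, page in sorted(events)]
            for user, events in by_user.items()}


def fetch_sequences(project_id, dataset_id, key_path="service_account.json"):
    """Run the query with the service account key and return per-user page paths."""
    # Imported here so the pure helpers above work without the client library.
    from google.cloud import bigquery

    client = bigquery.Client.from_service_account_json(key_path)
    result = client.query(build_path_query(project_id, dataset_id)).result()
    return rows_to_sequences(
        (row["user_pseudo_id"], row["event_timestamp"], row["page_location"])
        for row in result
    )
```

Calling fetch_sequences with your own project and dataset IDs returns a dict that maps each user_pseudo_id to its ordered list of page URLs, which is the kind of input a sequence mining step needs.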
- plot_important_features_prefixspan: Plots the important conversion path sequences of the PrefixSpan model.
- convertor_review: Sequence patterns of similarity and anomalies in non-convertors that are clustered with convertors.
- analyze_divergence: Scores the similarity of non-convertor to convertor sequences.
- analyze_clusters: Clusters users based on their navigational paths and labels clusters.
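For readers unfamiliar with PrefixSpan, the idea behind the first function above can be sketched in pure Python. This is an illustrative, simplified version (single pages per step, gaps between steps allowed), not the implementation or library this repository uses. The function name prefixspan and the min_support parameter are assumptions made for the example.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern: support} for every sequential pattern that occurs
    in at least min_support of the input sequences."""
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start position) pairs: the suffix
        # of each sequence that remains after matching the current prefix.
        first_pos = defaultdict(dict)
        for seq_idx, start in projected:
            seq = sequences[seq_idx]
            for pos in range(start, len(seq)):
                # Keep only the first occurrence of each item per sequence.
                first_pos[seq[pos]].setdefault(seq_idx, pos)

        for item, occurrences in first_pos.items():
            support = len(occurrences)  # number of distinct sequences
            if support >= min_support:
                pattern = prefix + (item,)
                results[pattern] = support
                # Recurse on the suffixes that follow this item.
                mine(pattern, [(i, p + 1) for i, p in occurrences.items()])

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


paths = [
    ["/", "/blog", "/pricing", "/signup"],
    ["/", "/pricing", "/signup"],
    ["/blog", "/pricing"],
]
patterns = prefixspan(paths, min_support=2)
for pattern, support in sorted(patterns.items(), key=lambda kv: -kv[1]):
    print(support, " > ".join(pattern))
```

Patterns that are frequent among converting users but rare among non-converting users are the natural candidates for "important" conversion path sequences.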
- Add more documentation
- Update sequence importance for sequences that pass through certain pages.
- Add more sequence mining algorithms
- Add attribution models
- Remove sequences after conversion
- Add sequence divergence
- Analysis by section
- Analysis through product/service page
- Analysis through blog
- Analysis through pricing page
- Analysis by source/medium
- Analysis to score pages based on their importance (presence in conversion and closeness to conversion)
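The last item above could be approached in several ways. One possible sketch, under assumptions that are not part of this repository: each converting path ends at the conversion step, and a page earns 1 / (1 + steps remaining to the conversion) for every converting path it appears in, averaged over all converting paths. The function name score_pages and the weighting formula are hypothetical.

```python
from collections import defaultdict


def score_pages(converting_paths):
    """Score pages by presence in converting paths and closeness to the conversion.

    Assumption: the last element of each path is the conversion step.
    """
    if not converting_paths:
        return {}

    totals = defaultdict(float)
    for path in converting_paths:
        # Use the last occurrence of each page, i.e. the one closest to conversion.
        last_index = {page: i for i, page in enumerate(path)}
        for page, i in last_index.items():
            steps_to_conversion = len(path) - 1 - i
            totals[page] += 1.0 / (1 + steps_to_conversion)

    # Averaging over all converting paths rewards presence as well as closeness.
    n = len(converting_paths)
    return {page: total / n for page, total in totals.items()}


scores = score_pages([
    ["/", "/pricing", "/signup"],
    ["/blog", "/signup"],
])
```

A page that appears in every converting path right before the conversion scores close to 0.5, while a page seen rarely or early in the path scores close to 0.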