


💡 Bulk correlations

A simple Python app for quickly uncovering potential relationships in a CSV dataset: it gives an overview of the correlation coefficients between many pairs of metrics at once. Made with Streamlit.



  1. Upload a CSV file from the left sidebar.
  • Except for the first column, all columns must contain numeric values.
  • Make sure that your CSV file uses a comma as the separator (not a semicolon).
  • Make sure that the column names in your CSV file don't contain special characters such as parentheses or quotes (white spaces, hyphens and underscores are okay).

The expected format is a header row followed by one row of values per item. For example, the header row for a GA4 export:

Address,Word Count,GA4 Sessions,GA4 Views,GA4 Engaged sessions,GA4 Bounce rate,Performance Score,First Contentful Paint Time (ms),Speed Index Time (ms),Largest Contentful Paint Time (ms),Time to Interactive (ms),Total Blocking Time (ms),Cumulative Layout Shift,Image Count
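
The requirements in step 1 can be checked before uploading. The following is a minimal sketch, not the app's own code: `check_csv` and `FORBIDDEN_CHARS` are hypothetical names, the forbidden set covers only the characters the list above names (parentheses and quotes), and it uses only the Python standard library.

```python
import csv
import io

# Characters named as problematic in column names (assumed, not exhaustive).
FORBIDDEN_CHARS = set("()'\"")


def check_csv(text):
    """Return a list of human-readable problems found in CSV text."""
    problems = []
    rows = list(csv.reader(io.StringIO(text), delimiter=","))
    if not rows:
        return ["file is empty"]
    header, data = rows[0], rows[1:]

    # A single column containing semicolons suggests the wrong separator.
    if len(header) == 1 and ";" in header[0]:
        return ["separator looks like a semicolon, expected a comma"]

    for name in header:
        if FORBIDDEN_CHARS & set(name):
            problems.append(f"column name has special characters: {name!r}")

    # Every column except the first must parse as a number.
    for line_no, row in enumerate(data, start=2):
        for name, value in zip(header[1:], row[1:]):
            try:
                float(value)
            except ValueError:
                problems.append(f"line {line_no}: {name!r} is not numeric: {value!r}")
    return problems


# Hypothetical sample with one non-numeric cell.
sample = "Address,Word Count,Sessions\n/home,120,45\n/about,n/a,12\n"
print(check_csv(sample))
```

An empty list means none of the three checks failed. Empty cells are reported as non-numeric here, which may be stricter than what the app itself accepts.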

  2. The correlation matrix automatically appears under "Results:".
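
To illustrate what a correlation matrix contains, here is a small standard-library sketch. It assumes Pearson correlation, which this README does not state explicitly; `pearson`, `correlation_matrix` and the sample values are hypothetical. With pandas, `DataFrame.corr()` produces the same kind of matrix and uses Pearson by default.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when a column is constant
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map every (name_a, name_b) pair of columns to its coefficient."""
    return {
        (a, b): pearson(columns[a], columns[b])
        for a in columns
        for b in columns
    }


# Hypothetical metrics: one list of values per numeric column.
metrics = {
    "Word Count": [100, 200, 300, 400],
    "Sessions": [10, 20, 30, 40],
    "Bounce rate": [0.9, 0.7, 0.5, 0.3],
}
matrix = correlation_matrix(metrics)
for a in metrics:
    print(a, [round(matrix[(a, b)], 2) for b in metrics])
```

Each coefficient lies between -1 and 1. In this made-up data, word count rises in step with sessions and falls in step with bounce rate, so those coefficients sit at the two extremes.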


  3. Once the correlation matrix is displayed, a dropdown list containing all the metric pairs appears. The list is sorted from the most strongly correlated pair (positively or negatively) to the least correlated. After you select a metric pair, a scatter plot appears below it, showing the actual distribution of the data points.
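
The ordering of the dropdown can be sketched as a sort on the absolute value of each coefficient, so that strong negative correlations rank alongside strong positive ones. This is an illustration under that assumption, not the app's code; `rank_pairs` and the coefficients below are hypothetical.

```python
def rank_pairs(coefficients):
    """Sort (pair, r) items from strongest to weakest by absolute value."""
    return sorted(coefficients.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical coefficients for three metric pairs.
coefficients = {
    ("Word Count", "Sessions"): 0.42,
    ("Word Count", "Bounce rate"): -0.81,
    ("Sessions", "Bounce rate"): 0.05,
}
ranked = rank_pairs(coefficients)
for (a, b), r in ranked:
    print(f"{a} vs {b}: {r:+.2f}")
```

Here the pair with r = -0.81 comes first even though its coefficient is negative, because only the strength of the relationship determines the order.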
