This code provides a data explorer and visualizations for a given dataset. The following is a brief explanation of the code:
-
Importing required libraries:
- streamlit: used for building interactive web apps
- pandas: used for data manipulation and analysis
- matplotlib: used for creating visualizations
- seaborn: used for creating statistical graphics
-
Loading the dataset:
- The function
load_data()
is used to load the dataset from a CSV file ("data.csv"). - The
@st.cache
decorator is used to cache the results of the function for better performance.
- The function
-
Selecting columns:
- A sidebar is created using
st.sidebar.title()
to select columns of the dataset to be displayed. st.sidebar.multiselect()
is used to create a multiselect widget to choose columns of the dataset.
- A sidebar is created using
-
Displaying data:
- The selected columns of the dataset are displayed using
st.write()
anddf[columns]
.
- The selected columns of the dataset are displayed using
-
Creating visualizations:
- Two types of visualizations are created using
matplotlib
andseaborn
libraries: Histogram and Scatterplot. - A histogram of the first selected column is created using
plt.subplots()
andax.hist()
. The visualization is then displayed usingst.pyplot()
. - A scatterplot is created using
sns.scatterplot()
with the first two selected columns of the dataset. The visualization is then displayed usingst.pyplot()
.
- Two types of visualizations are created using
To run the code, the following steps should be followed:
- Install the required libraries using
pip install streamlit pandas matplotlib seaborn
- Save the dataset as a CSV file named "data.csv" in the same directory as the code file.
- Run the code using
streamlit run <filename>.py
in the terminal/command prompt.
Sure, here's an example README.md
file for this code:
Once the app is running, you'll see a sidebar on the left where you can select the columns you want to explore. Simply check the boxes next to the column names, and the app will display the corresponding data.
You can also switch to the "Visualizations" tab to see histograms and scatterplots of the selected columns.
This app uses a sample dataset called data.csv
. You can replace this file with your own CSV file by updating the load_data()
function in app.py
.
This app was built using the following libraries:
- Streamlit: https://streamlit.io/
- Pandas: https://pandas.pydata.org/
- Matplotlib: https://matplotlib.org/
- Seaborn: https://seaborn.pydata.org/
This project is licensed under the MIT License - see the LICENSE file for details.