This project demonstrates how to set up a Dropbox User Sentiment Analysis Dashboard using Airbyte for data extraction, Motherduck (DuckDB) for storage and querying, and Streamlit for visualization.
The goal is to analyze user reviews of the Dropbox app using sentiment analysis techniques. Here's the workflow breakdown:
- Dataset Source: A CSV dataset of Dropbox app user reviews (from Kaggle).
- Preprocessing: Data uploaded to Google Sheets for formatting.
- Airbyte Integration: Google Sheets (source) connected to Motherduck (destination) via Airbyte.
- Destination Setup: Motherduck stores data in DuckDB.
- Sentiment Analysis: Python (TextBlob) scores each review, and a Streamlit dashboard visualizes the results.
Live Demo: Streamlit Dashboard
Blog Post: Detailed Guide on Sentiment Analysis
- Airbyte: Data Integration
- Motherduck (DuckDB): Database Management
- Streamlit: Data Visualization
- Python: Backend and Sentiment Analysis
- TextBlob: Sentiment Analysis Library
- Plotly: Data Visualization Library
DROPBOX-REVIEWS-ANALYSIS
├── .devcontainer
│   └── devcontainer.json
├── .streamlit
│   └── config.toml
├── assets
│   └── main.png
├── dropbox-reviews-analytics
│   ├── src
│   │   ├── config
│   │   │   ├── __init__.py
│   │   │   └── config.py
│   │   ├── utils
│   │   │   ├── __init__.py
│   │   │   └── database.py
│   │   └── app.py
├── .env
├── venv
├── .gitignore
├── LICENSE.md
├── README.md
└── requirements.txt
- .streamlit/config.toml: UI customization for Streamlit.
- src/config/config.py: Handles environment variables.
- src/utils/database.py: Database queries with Motherduck.
- src/app.py: Core Streamlit app logic.
- .env: Stores secure environment variables.
git clone https://github.com/abhirajadhikary06/Dropbox-Sentiment-Analysis.git
cd Dropbox-Sentiment-Analysis
python -m venv venv
source venv/bin/activate # On macOS/Linux
venv\Scripts\activate # On Windows
pip install -r requirements.txt
Create a .env file in the root directory and add:
MOTHERDUCK_TOKEN=your_motherduck_api_key
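For illustration, src/config/config.py could expose the token roughly as follows. The repository's actual implementation is not shown in this README, so treat this as a standard-library sketch: `load_env_file` is a hypothetical helper, and a package such as python-dotenv is a common alternative.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into os.environ (hypothetical helper)."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    loaded = {}
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("'\"")
        loaded[key] = value
        # Variables already set in the real environment take precedence
        os.environ.setdefault(key, value)
    return loaded


load_env_file()
# Empty string when the token is missing, so callers can check it explicitly
MOTHERDUCK_TOKEN = os.environ.get("MOTHERDUCK_TOKEN", "")
```

Keeping the lookup in one module means the rest of the app can import `MOTHERDUCK_TOKEN` without reading the file itself.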
streamlit run src/app.py
The app will be available at http://localhost:8501.
Using TextBlob, we calculate each review's sentiment polarity (from -1, most negative, to 1, most positive) and subjectivity (from 0, objective, to 1, subjective).
from textblob import TextBlob

def get_sentiment(text, sentiment_type="Polarity"):
    # Return the polarity or the subjectivity score for a single review
    blob = TextBlob(str(text))
    return blob.sentiment.polarity if sentiment_type == "Polarity" else blob.sentiment.subjectivity
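A raw polarity score is often easier to read once it is bucketed into a label. The sketch below shows one way to do that; `label_polarity` and the 0.1 threshold are assumptions for illustration, not part of the project's code.

```python
def label_polarity(polarity, threshold=0.1):
    """Map a polarity score in [-1, 1] to a coarse label (threshold is an assumed value)."""
    if polarity > threshold:
        return "Positive"
    if polarity < -threshold:
        return "Negative"
    return "Neutral"


# Example polarity values of the kind get_sentiment returns
scores = [0.8, 0.05, -0.4]
labels = [label_polarity(s) for s in scores]
print(labels)  # ['Positive', 'Neutral', 'Negative']
```

A wider threshold sends more borderline reviews to "Neutral", which may be preferable for short reviews with few sentiment-bearing words.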
import duckdb
from config.config import MOTHERDUCK_TOKEN

def get_connection():
    return duckdb.connect(f"md:?token={MOTHERDUCK_TOKEN}")

def get_reviews_for_sentiment():
    conn = get_connection()
    query = """
        SELECT content, score FROM dropbox_reviews WHERE content IS NOT NULL
    """
    return conn.execute(query).fetch_df()
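import plotly.express as px
import streamlit as st

# reviews_df must already contain a 'sentiment' column, e.g. computed with get_sentiment
fig = px.histogram(reviews_df, x='sentiment', title='Sentiment Distribution')
st.plotly_chart(fig)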
- Avoid specifying exact library versions in requirements.txt.
- Ensure .env is correctly configured.
- Validate database connection tokens during runtime.
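Validating the token at runtime can be as simple as failing early with a clear message before any connection is attempted. This is a minimal sketch: `build_motherduck_dsn` is a hypothetical helper, and it reuses the `md:?token=...` connection string format shown in this README.

```python
import os


def build_motherduck_dsn(token=None):
    """Return the 'md:' connection string, or raise if no token is available."""
    if token is None:
        # Fall back to the environment when no token is passed explicitly
        token = os.environ.get("MOTHERDUCK_TOKEN", "")
    token = token.strip()
    if not token:
        raise RuntimeError(
            "MOTHERDUCK_TOKEN is not set; add it to .env before starting the app."
        )
    return f"md:?token={token}"
```

The returned string could then be passed to `duckdb.connect`, so a missing token surfaces as a readable error rather than a failed connection.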
Deployment Steps:
- Load .env variables.
- Connect securely to Motherduck.
- Serve the dashboard via Streamlit.
- Improve dashboard interactivity.
- Add real-time review updates.
- Expand to analyze multiple datasets.
Complete Project on GitHub: GitHub Repository
Live Demo: Streamlit Dashboard
Motherduck Instance:
-- Run this snippet to attach database
ATTACH 'md:_share/abhiraj_db/275eb3cc-2d8b-4705-a787-39c8010e8b2f';
This project is licensed under the Creative Commons Zero v1.0 Universal license.