Data Scientist Portfolio

Welcome!

Hi! I'm Felipe Valory, a data scientist skilled at transforming data into valuable insights that drive business results.

I’m experienced in every step of creating data-driven business solutions, from understanding the problem to deploying the model in production using cloud platforms.

About My Experience

I’ve had the opportunity to develop solutions for important business challenges, such as:

  • Forecasting property prices.
  • Identifying customers at risk of churn.
  • Predicting sales trends.
  • Creating data-driven dashboards for management.
  • Designing movie recommendation systems.

Click here to dive into my portfolio!

What You'll Find

About Me

A quick introduction to who I am, including my background, experience, and career path, along with what drives me in the world of Data Science.

Projects

A showcase of the most impactful projects I’ve developed, demonstrating my hands-on skills and ability to solve real-world problems using data. Here are some highlights:

Project 1: Health Insurance Cross-sell

  • Description: A model that ranks potential customers by their likelihood of purchasing car insurance (propensity score).
  • Technologies Used: Python, Pandas, Sweetviz, Seaborn, Scikit-Learn, Machine Learning models, Git, Agile Methodology.
  • Repository: GitHub Link

Project 2: Predicting Customer Churn and Retention for a Bank

  • Description: A Machine Learning model to predict customer churn (account cancellations) for a bank.
  • Technologies Used: Python, Pandas, NumPy, Seaborn, Scikit-Learn, Random Forest, Git, Agile Methodology.
  • Repository: GitHub Link

Project 3: Sales Forecast for a Drugstore Chain

  • Description: A Machine Learning model to predict the next six weeks of sales for Rossmann stores.
  • Technologies Used: Python, Pandas, NumPy, Flask, Inflection, Seaborn, Scikit-Learn, Boruta, Linear Regression, Random Forest, XGBoost, Git, Heroku Cloud, Agile Methodology.
  • Repository: GitHub Link

Project 4: Delivery Data Analysis

  • Description: A dashboard to monitor strategic KPIs for a delivery company.
  • Technologies Used: Python, Pandas, Matplotlib, Plotly, Streamlit, Streamlit Cloud, Git, JupyterLab.
  • Repository: GitHub Link

Project 5: Machine Learning Experiments

  • Description: Performance analysis of Regression, Classification, and Clustering models.
  • Technologies Used: Regression, Classification, and Clustering Algorithms, Performance Metrics (RMSE, MSE, MAE, MAPE, Precision, Recall, ROC, F1-Score, Silhouette Score, R²), Scikit-Learn.
  • Repository: GitHub Link

Project 6: Airbnb Dashboard

  • Description: An analysis of Airbnb revenue and customer behavior data to generate insights.
  • Technologies Used: Power Query, Power BI.
  • Repository: Access the Dashboard

Tools

  • Data Collection and Storage: SQL, MySQL, and PostgreSQL
  • Data Processing and Analysis: Python and statistics
  • Development: Git, Scrum, Project Management
  • Data Visualization: Power BI, Looker, and Streamlit
  • Machine Learning Modeling: Classification, Regression, and Clustering
  • Machine Learning Deployment: AWS Cloud, Heroku, Streamlit Cloud, Google Cloud Platform (GCP)

Let's Connect

I'm excited to discuss how my Data Science skills can add value to your team. Feel free to reach out!

Gmail
LinkedIn
Thank you for visiting my portfolio. Let’s turn data into strategic decisions together!