Machine Learning Projects

Welcome! This repository contains a selection of data science and machine learning projects I’ve worked on, demonstrating a broad skill set across predictive modeling, natural language processing, deep learning, and more. Each project includes an overview, methodology, key results, and insights, along with the code and relevant data files.

Table of Contents

  1. Waiter Tips Analysis & Prediction
  2. Language Detection
  3. Text Emotions Classification
  4. Telco Customer Churn Prediction
  5. Future Sales Prediction
  6. Electricity Price Prediction
  7. Online Payments Fraud Detection
  8. Classification with Neural Networks
  9. Stress Detection

Core Skills Demonstrated

  • Data Preprocessing & EDA: Feature engineering, handling missing values, visualizations.
  • Model Building: Supervised & unsupervised learning, time series forecasting, neural networks.
  • Natural Language Processing (NLP): Sentiment analysis, text classification, language detection.
  • Deep Learning: Neural networks, LSTM, transformer models.
  • Model Evaluation & Optimization: Cross-validation, grid search, and performance metrics.

Project Descriptions

1. Waiter Tips Analysis & Prediction

Goal: Analyze and predict the tips waitstaff receive based on factors like party size and meal time.

  • Techniques: Linear Regression, EDA with visualizations.
  • Results: Identified the factors most strongly associated with tip size.
  • Code: Link to Project
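
As a rough illustration of the regression idea in this project, here is a minimal, dependency-free sketch of ordinary least squares with a single feature. The data values and the helper name fit_simple_linear_regression are made up for this example; the project itself may use Scikit-Learn and more features than party size.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x; zero means every x is identical.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    # Sum of cross-deviations between x and y.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical illustrative data, not taken from the project's dataset.
party_sizes = [1, 2, 3, 4]
tips = [1.5, 2.5, 3.5, 4.5]

slope, intercept = fit_simple_linear_regression(party_sizes, tips)
predicted_tip_for_five = slope * 5 + intercept
```

The slope can be read as the expected change in tip for each additional guest, which is the kind of factor-level insight the project aims for.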

2. Language Detection with Machine Learning

Goal: Detect the language of given text samples.

  • Techniques: NLP, Naive Bayes, Logistic Regression.
  • Results: Achieved high classification accuracy across multiple languages.
  • Code: Link to Project
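
To show how a Naive Bayes classifier can detect language, the sketch below implements a small multinomial Naive Bayes model over character bigrams with add-one smoothing, using only the standard library. The class name, the bigram choice, and the four training sentences are assumptions made for illustration; the project's own pipeline may be built differently.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Split text into overlapping character n-grams, padded with spaces."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NaiveBayesLanguageDetector:
    """Multinomial Naive Bayes over character n-grams (illustrative only)."""

    def __init__(self, n=2):
        self.n = n
        self.ngram_counts = defaultdict(Counter)  # label -> n-gram counts
        self.doc_counts = Counter()               # label -> number of samples
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n)
            self.ngram_counts[label].update(grams)
            self.doc_counts[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, text):
        grams = char_ngrams(text, self.n)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, counts in self.ngram_counts.items():
            total = sum(counts.values())
            # Log prior plus log likelihoods with add-one (Laplace) smoothing.
            score = math.log(self.doc_counts[label] / total_docs)
            for gram in grams:
                score += math.log((counts[gram] + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Tiny made-up training set; a real model needs far more text per language.
train_texts = [
    "the quick brown fox jumps over the lazy dog",
    "this is a simple sentence in english",
    "el rápido zorro marrón salta sobre el perro perezoso",
    "esta es una oración sencilla en español",
]
train_labels = ["english", "english", "spanish", "spanish"]

detector = NaiveBayesLanguageDetector().fit(train_texts, train_labels)
english_guess = detector.predict("the dog is in the house")
spanish_guess = detector.predict("el perro está en la casa")
```

Character n-grams are a common choice for this task because they capture spelling patterns without needing a per-language tokenizer.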

3. Text Emotions Classification Using Python

Goal: Classify text data into emotions (e.g., joy, anger, sadness).

  • Techniques: Natural Language Processing (NLP) with TF-IDF, Naive Bayes, and SVM.
  • Results: High accuracy on multi-class emotion classification.
  • Code: Link to Project
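
The TF-IDF weighting mentioned above can be sketched in a few lines. This version uses the textbook definition (term frequency times the log of inverse document frequency); Scikit-Learn's TfidfVectorizer applies a smoothed and normalized variant by default, so its numbers will differ. The function name tf_idf and the sample sentences are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document using textbook TF-IDF."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Made-up sentences for illustration only.
docs = ["i am so happy today", "i am so angry today", "i feel sad"]
weights = tf_idf(docs)
```

Words that appear in every document, such as "i" here, receive a weight of zero, while emotion-bearing words that are unique to a document receive the highest weights. This is what makes the representation useful as input to Naive Bayes or SVM classifiers.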

4. Predicting Telco Customer Churn

Goal: Predict customer churn for a telecommunications company.

  • Techniques: Logistic Regression, Random Forest, and XGBoost.
  • Results: Built predictive models and used them to identify the main drivers of churn.
  • Code: Link to Project

5. Future Sales Prediction

Goal: Forecast future sales for a retail company using historical sales data.

  • Techniques: Time series forecasting with ARIMA, Prophet, and LSTM models.
  • Results: Accurate forecasting of weekly sales, aiding inventory and resource management.
  • Code: Link to Project

6. Electricity Price Prediction

Goal: Predict future electricity prices using time series data.

  • Techniques: Time series analysis, LSTM and GRU models for sequential data.
  • Results: Successfully forecasted short-term price fluctuations.
  • Code: Link to Project

7. Online Payments Fraud Detection

Goal: Develop a model to detect fraudulent online transactions and reduce false positives.

  • Techniques: Logistic Regression, Random Forest, Isolation Forest, and SMOTE for handling imbalanced data.
  • Results: Achieved high recall while keeping false positives low, both of which matter for effective fraud detection.
  • Code: Link to Project
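
Because fraud data is heavily imbalanced, accuracy alone says little, which is why the results above are framed in terms of recall and false positives. The sketch below computes precision and recall from raw labels; the helper name and the label lists are invented for illustration and are not results from the project.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Compute (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels: 1 = fraudulent, 0 = legitimate.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 0, 0, 1, 1, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
```

Recall is the share of actual fraud cases the model catches, while precision falls as false positives rise, so tracking both makes the trade-off described in this project explicit.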

8. Classification with Neural Networks Using Python

Goal: Explore neural networks for classification on structured datasets.

  • Techniques: Multilayer Perceptron (MLP) networks, hyperparameter tuning.
  • Results: Improved performance over traditional models on the same datasets.
  • Code: Link to Project

9. Stress Detection

Goal: Detect stress levels based on physiological data.

  • Techniques: Data preprocessing, feature engineering, ensemble models.
  • Results: Identified patterns in the data that indicate stress, a first step toward real-time applications.
  • Code: Link to Project

Key Insights & Impact

These projects apply data science and machine learning to practical problems in domains ranging from retail sales and telecom churn to electricity pricing and payment fraud. I enjoy the challenge of translating data into actionable insights and building predictive models that make a tangible impact.


Tools & Technologies

  • Programming: Python (Pandas, NumPy, Scikit-Learn), Jupyter Notebooks
  • Visualization: Matplotlib, Seaborn, Plotly
  • Machine Learning: Scikit-Learn, TensorFlow, Keras
  • NLP: NLTK, TF-IDF, Word Embeddings
  • Time Series Analysis: ARIMA, Prophet, LSTM