Tech-with-Vidhya
Hello and welcome to my GitHub portfolio page. It includes all of my Data Science, Machine Learning Engineering, NLP, GenAI, LLM and Big Data Engineering projects.
AI/ML/Data Engineer & Solutions Architect | Mars | Queen Mary University of London | UK
Pinned Repositories
Automated_ETL_Finance_Data_Pipeline_with_AWS_Lambda_Spark_Transformation_Job_Python
This project implements an automated ETL data pipeline for financial stock trade transactions using Python and AWS services, with a Spark transformation job. The pipeline is automated with an AWS Lambda function and a trigger: whenever a new file is ingested into the AWS S3 bucket, the Lambda function fires and launches the AWS Glue crawler and the Spark transformation ETL job. The transformation job, implemented in PySpark, processes the trade transaction data stored in the S3 bucket and filters it down to the subset of trades in which the total number of shares transacted is less than or equal to 100. Tools & Technologies: Python, Boto3 SDK, PySpark, AWS CLI, AWS Virtual Private Cloud (VPC), AWS VPC Endpoint, AWS S3, AWS Glue, AWS Glue Crawler, AWS Glue Jobs, AWS Athena, AWS Lambda, Spark
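A minimal sketch of the trigger logic described above: an S3 put event arrives at Lambda, which starts the Glue Spark job against the new object. The job name, argument key and event shape here are illustrative assumptions, not the repo's actual values; the Glue client is injected so the logic can be exercised without AWS.

```python
GLUE_JOB_NAME = "spark-trade-transform"  # hypothetical Glue job name


def extract_s3_object(event):
    """Pull bucket and key out of a standard S3 put-event payload."""
    record = event["Records"][0]["s3"]
    return record["bucket"]["name"], record["object"]["key"]


def handle_s3_event(event, glue_client):
    """Start the Glue Spark job for the newly ingested S3 object."""
    bucket, key = extract_s3_object(event)
    run = glue_client.start_job_run(
        JobName=GLUE_JOB_NAME,
        Arguments={"--input_path": f"s3://{bucket}/{key}"},  # assumed argument name
    )
    return run["JobRunId"]


def keep_small_trades(rows):
    """Plain-Python analogue of the PySpark filter: shares transacted <= 100."""
    return [r for r in rows if r["shares"] <= 100]


def lambda_handler(event, context):
    import boto3  # imported lazily so the pure logic above stays testable offline
    return handle_s3_event(event, boto3.client("glue"))
```

Injecting the client keeps the handler's logic unit-testable with a stub, which is a common pattern for Lambda functions wrapping boto3 calls.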
AWS_SageMaker_TensorFlow_Keras_CNN_Model_Fashion_MNIST
An AWS SageMaker machine learning project that trains a TensorFlow Keras CNN on the Fashion-MNIST dataset.
bank_customers_churn_prediction_exploring_7_different_classification_algorithms
This project classifies bank customers on whether a customer will leave the bank (i.e. churn) or not, by applying the following steps of the data science project life-cycle:
1. Data Exploration, Analysis and Visualisation
2. Data Pre-processing
3. Data Preparation for Modelling
4. Model Training
5. Model Validation
6. Optimised Model Selection based on Various Performance Metrics
7. Deploying the Best Optimised Model on Unseen Test Data
8. Evaluating the Optimised Model's Performance Metrics
The business case of determining the churn status of bank customers is explored, trained and validated on 7 different classification algorithms/models, listed below, and the best optimised model is selected based on accuracy metrics:
1. Decision Tree Classifier - CART (Classification and Regression Tree) Algorithm
2. Decision Tree Classifier - ID3 (Iterative Dichotomiser 3) Algorithm
3. Ensemble Random Forest Classifier Algorithm
4. Ensemble Adaptive Boosting (AdaBoost) Classifier Algorithm
5. Ensemble Hist Gradient Boosting Classifier Algorithm
6. Ensemble Extreme Gradient Boosting (XGBoost) Classifier Algorithm
7. Support Vector Machine (SVM) Classifier Algorithm
Bitcoin_Network_Analytics_using_Python_NetworkX_and_Gephi
This 4-member group project was delivered as part of the "Digital Media and Social Network" module of my Masters in Big Data Science (MSc BDS) programme at Queen Mary University of London (QMUL), London, United Kingdom. It covers network analysis across 4 different problem statements and use cases, using the Python NetworkX package, the Gephi network analysis tool and Microsoft Excel.
Dataset: Bitcoin trade transactions for the period 2011 to 2016.
Dataset representation: each Bitcoin trade transaction carries the attributes (Rater, Ratee, Rating, Timestamp).
Network formation: for every trade transaction between 2 users in the Bitcoin network, a rating is recorded in the system with the corresponding timestamp, giving a directed network.
Size of the dataset and network: Users/Nodes = 5,881; Transactions/Edges = 35,592; ratings range from -10 (lowest) to +10 (highest).
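As a toy illustration of the basic network statistics involved, the same quantities can be computed over (rater, ratee, rating, timestamp) tuples with plain Python; the project itself uses NetworkX's directed graph (`DiGraph`), which exposes these directly. The sample edges below are made up, not taken from the real dataset.

```python
from collections import defaultdict

# Tiny illustrative sample of (rater, ratee, rating, timestamp) edges
edges = [
    ("A", "B", 5, 1322704000),
    ("A", "C", -2, 1322705000),
    ("B", "C", 10, 1322706000),
]

# Node set: every user that rated or was rated
nodes = {u for u, v, *_ in edges} | {v for u, v, *_ in edges}

# Directed degrees: how many ratings a user gave vs. received
out_degree = defaultdict(int)
in_degree = defaultdict(int)
for rater, ratee, rating, ts in edges:
    out_degree[rater] += 1
    in_degree[ratee] += 1

# Mean rating across all trades
avg_rating = sum(r for _, _, r, _ in edges) / len(edges)
```

With NetworkX the equivalent would be `G = nx.DiGraph(); G.add_weighted_edges_from(...)` followed by `G.in_degree()` / `G.out_degree()`.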
credit-risk-assessment-fintech-framework-using-deep-learning-and-transfer-learning
This project presents a dual credit risk assessment framework that predicts credit scores and forecasts the credit default risk of consumers of financial institutions such as commercial banks and lending firms. The implementation mimics the real-world FICO scoring model, with custom enhancements that incorporate a lender's internal credit risk factors by proposing a new Domain-Tech feature selection approach combined with deep learning and transfer learning techniques. This is my masters final project, delivered as part of the Masters in Big Data Science programme at Queen Mary University of London (QMUL), United Kingdom.
ETL_Finance_Data_Pipeline_Python_AWS_CLI_S3_Glue_Athena
This project covers the implementation of an ETL data pipeline for financial stock trade transactions using Python and AWS services. Tools & Technologies: Python, Boto3 SDK, AWS CLI, AWS Virtual Private Cloud (VPC), AWS VPC Endpoint, AWS S3, AWS Glue, AWS Glue Crawler, AWS Athena, AWS Redshift
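A hedged sketch of the query stage at the end of such a pipeline: once the Glue crawler has catalogued the S3 data, Athena can query it with plain SQL. The database name, table name and results bucket below are illustrative assumptions; the Athena client is injected so the function can be tested without AWS.

```python
def start_trades_query(athena_client, database="finance_db",
                       output="s3://athena-results-bucket/"):
    """Kick off an Athena SQL query over the crawled trades table.

    All names here are hypothetical placeholders for illustration.
    """
    resp = athena_client.start_query_execution(
        QueryString="SELECT symbol, SUM(shares) AS total FROM trades GROUP BY symbol",
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output},
    )
    return resp["QueryExecutionId"]


if __name__ == "__main__":
    import boto3  # real client only when actually run against AWS
    print(start_trades_query(boto3.client("athena")))
```

Athena runs queries asynchronously, so a real caller would poll `get_query_execution` with the returned id before fetching results.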
MLOps_AWS_Kubernetes_LoadBalancing_Docker_Flask_Banking_Customers_Digital_Transformation_Classifier
This is an AWS MLE and MLOps Bank Customers Digital Transformation Project.
NLP_Multi-Class_Text_Classification_using_BERT_Model
NLP_Text_Classification_with_Transformers_RoBERTa_and_XLNet_Models
productionized_docker_ML_model_application_into_kubernetes_cluster_using_AWS_EKS_CloudFormation_EMR
This project covers the end-to-end implementation of deploying and productionizing a dockerized/containerized machine learning Python Flask application into a Kubernetes cluster, using AWS Elastic Kubernetes Service (EKS), AWS serverless Fargate instances, an AWS CloudFormation stack and the AWS Elastic Container Registry (ECR) service. The machine learning business case is a bank note authentication binary classifier built with a Random Forest Classifier, which predicts whether a bank note is fake (label 0) or genuine (label 1).
Implementation steps:
1. Created an end-to-end machine learning solution covering all the ML life-cycle steps of data exploration, feature selection, model training, model validation and model testing on unseen production data.
2. Saved the finalised model as a pickle file.
3. Created a Python Flask based API to serve the ML model's inferences to end-users.
4. Verified and tested the Flask API in a localhost set-up.
5. Created a Dockerfile (containing the instructions to build a docker image) for the Flask based bank note authentication application embedding the Random Forest classifier model.
6. Created IAM service roles with appropriate policies to access the AWS ECR, EKS and CloudFormation services.
7. Created a new EC2 Linux server instance in AWS and copied the web application project's directories and files onto it using SFTP Linux commands.
8. Installed Docker and the supporting Python libraries on the EC2 Linux instance, as per the "requirements.txt" file.
9. Built the Dockerfile into a docker image and container for the application, using docker build and run commands.
10. Created a docker repository in the AWS ECR service and pushed the application's docker image into it using AWS Command Line Interface (CLI) commands.
11. Created the cloud stack with private and public subnets using the AWS CloudFormation service, with appropriate IAM roles and policies.
12. Created the Kubernetes cluster using the AWS EKS service with appropriate IAM roles and policies, and linked it to the CloudFormation stack.
13. Created the AWS serverless Fargate profile and Fargate instances/nodes.
14. Created and configured the "Deployment.yaml" and "Service.yaml" files.
15. Applied the "Deployment.yaml", with its pod replica configuration, to the EKS cluster's Fargate nodes using kubectl commands.
16. Applied the "Service.yaml" using kubectl commands, exposing the application to end-users for public access via the production end-point.
17. Verified and tested the inferences of the productionized ML application using the Fargate end-point created in the AWS EKS cluster.
Tools & Technologies: Python, Flask, AWS, AWS EC2, Linux Server, Linux Commands, Command Line Interface (CLI), Docker, Docker Commands, AWS ECR, AWS IAM, AWS CloudFormation, AWS EKS, Kubernetes, Kubernetes kubectl Commands.
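Plausible shapes for the "Deployment.yaml" and "Service.yaml" files mentioned in steps 14-16, sketched under assumptions: the image URI, resource names, labels and port are placeholders, not the repository's actual values.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: banknote-classifier          # hypothetical name
spec:
  replicas: 2                        # pod replica configuration (step 15)
  selector:
    matchLabels: {app: banknote-classifier}
  template:
    metadata:
      labels: {app: banknote-classifier}
    spec:
      containers:
        - name: flask-app
          image: <account>.dkr.ecr.<region>.amazonaws.com/banknote:latest
          ports: [{containerPort: 5000}]
---
apiVersion: v1
kind: Service
metadata:
  name: banknote-service
spec:
  type: LoadBalancer                 # public production end-point (step 16)
  selector: {app: banknote-classifier}
  ports: [{port: 80, targetPort: 5000}]
```

These would be applied with `kubectl apply -f Deployment.yaml` and `kubectl apply -f Service.yaml`, as the steps describe.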
Tech-with-Vidhya's Repositories
Tech-with-Vidhya/productionized_docker_ML_model_application_into_kubernetes_cluster_using_AWS_EKS_CloudFormation_EMR
Tech-with-Vidhya/credit-risk-assessment-fintech-framework-using-deep-learning-and-transfer-learning
Tech-with-Vidhya/Bitcoin_Network_Analytics_using_Python_NetworkX_and_Gephi
Tech-with-Vidhya/productionized_docker_ML_model_application_into_AWS_EC2_Linux
This project covers the end-to-end implementation of deploying and productionizing a dockerized/containerized machine learning Python Flask application onto an AWS Elastic Compute Cloud (EC2) instance, using the AWS Elastic Container Registry (ECR) service. The machine learning business case is a bank note authentication binary classifier built with a Random Forest Classifier, which predicts whether a bank note is fake (label 0) or genuine (label 1).
Implementation steps:
1. Created an end-to-end machine learning solution covering all the ML life-cycle steps of data exploration, feature selection, model training, model validation and model testing on unseen production data.
2. Saved the finalised model as a pickle file.
3. Created a Python Flask based API to serve the ML model's inferences to end-users.
4. Verified and tested the Flask API in a localhost set-up.
5. Created a Dockerfile (containing the instructions to build a docker image) for the Flask based bank note authentication application embedding the Random Forest classifier model.
6. Created IAM service roles with appropriate policies to access the AWS ECR and EC2 services.
7. Created a new EC2 Linux server instance in AWS and copied the web application project's directories and files onto it using SFTP Linux commands.
8. Installed Docker and the supporting Python libraries on the EC2 Linux instance, as per the "requirements.txt" file.
9. Built the Dockerfile into a docker image and container for the application, using docker build and run commands.
10. Created a docker repository in the AWS ECR service and pushed the application's docker image into it using AWS Command Line Interface (CLI) commands.
11. Deployed the dockerized/containerized Flask ML application onto the AWS EC2 Linux instance, creating the production end-point.
12. Verified and tested the inferences of the productionized ML application using the AWS EC2 end-point.
Tools & Technologies: Python, Flask, AWS, AWS EC2, Linux Server, Linux Commands, Command Line Interface (CLI), Docker, Docker Commands, AWS ECR, AWS IAM
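A plausible Dockerfile for the Flask application described in step 5, sketched under assumptions: the Python version, entry-point file name (`app.py`) and port are illustrative, not taken from the repository.

```dockerfile
FROM python:3.9-slim

WORKDIR /app

# Install the supporting Python libraries listed in requirements.txt (step 8)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the Flask app and the pickled Random Forest model
COPY . .

EXPOSE 5000
CMD ["python", "app.py"]
```

Steps 9-10 then correspond to `docker build -t banknote .`, `docker run -p 5000:5000 banknote`, and a `docker push` to the ECR repository.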
Tech-with-Vidhya/MLOps_AWS_LoadBalancing_Docker_Flask_Terraform_Banking_Customers_Churn_Prediction_Ensemble_Technique
This is an AWS MLE and MLOps Bank Customers Churn Prediction Project.
Tech-with-Vidhya/AWS_ETL_NLP_Auto-Reply_Query_Handler_Using_Kafka_Spark_LSTM_Deep_Learning
Tech-with-Vidhya/dockerizing_credit_risk_assessment_python_flask_web_app_ML_models_deployment_AWS_ECR_ECS_Fargate_EC2
This project covers dockerizing a Python Flask based credit risk assessment calculator web application, integrated with two different deep learning and transfer learning based ML models, using AWS Elastic Container Registry (ECR) and AWS Elastic Container Service (ECS), and deploying it into both an AWS Fargate cluster and an EC2 instance/cluster. The two deployed ML models produce a 3-digit credit score for an individual and the individual's probability of default.
Implementation steps:
1. Created a Dockerfile for the Flask based credit risk assessment web application with its 2 deep learning models.
2. Created a new EC2 Ubuntu server instance in AWS and copied the web application project's directories and files onto it using SFTP Linux commands.
3. Built the Dockerfile into a docker image.
4. Created a docker repository in AWS using the AWS ECR service.
5. Authenticated the docker user login credentials with AWS using the AWS Command Line Interface (CLI).
6. Pushed the web application's docker image into AWS ECR.
7. Created the docker container in AWS using the AWS ECS service.
8. Created a task definition in the AWS ECS service linked to the docker container.
9. Configured the ECS service definition, specifying the replicas of the task definition to be executed, and enabled the load balancer feature to manage the web application's incoming requests and traffic into the AWS cluster.
10. Configured the AWS Fargate cluster to execute the service and tasks, and deployed the docker based web application into the Fargate cluster.
11. Alternatively, created and configured an AWS EC2 instance/cluster, created an Identity and Access Management (IAM) user with roles and policies, executed the ECS tasks, and deployed the docker based web application onto the EC2 instance.
Tools & Technologies: Python, Flask, HTML, AWS, EC2, Ubuntu Server, Linux Commands, Command Line Interface (CLI), Docker, ECR, ECS, Fargate, IAM
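A hedged sketch of what the ECS task definition linked to the container could look like for the Fargate deployment path; the family name, image URI, CPU/memory sizing and port are placeholders, not the project's actual configuration.

```json
{
  "family": "credit-risk-web-app",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "512",
  "memory": "1024",
  "containerDefinitions": [
    {
      "name": "flask-credit-app",
      "image": "<account>.dkr.ecr.<region>.amazonaws.com/credit-risk:latest",
      "portMappings": [{"containerPort": 5000, "protocol": "tcp"}],
      "essential": true
    }
  ]
}
```

The ECS service definition would then reference this task definition, set the desired replica count, and attach the load balancer described above.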
Tech-with-Vidhya/AWS_SageMaker_TensorFlow_Keras_CNN_Model_Fashion_MNIST
This is an AWS SageMaker TensorFlow Keras CNN Machine Learning Project.
Tech-with-Vidhya/CC-Flask-CMS-API
This repository contains a web-based content (articles) management system for my data science learning and career journey, with user registration and login functionality, built using Python, the Flask web framework, HTML and a PostgreSQL database. The application is implemented and deployed on the Heroku cloud server.
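The repository's registration/login internals aren't reproduced here; as a hedged illustration of the kind of password handling such a flow needs, this standard-library sketch stores a salted PBKDF2 hash instead of the plain-text password. Function names and parameters are assumptions for illustration only.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None):
    """Return (salt, digest) for storage; a fresh random salt per user."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest


def verify_password(password, salt, digest):
    """Constant-time check of a login attempt against the stored hash."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, digest)
```

In a Flask app, `hash_password` would run in the registration route and `verify_password` in the login route, with salt and digest persisted in the PostgreSQL users table.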
Tech-with-Vidhya/Coursera-Deep-Learning-Specialization-2021
Notes, programming assignments and quizzes from all courses within the Coursera Deep Learning specialization offered by deeplearning.ai: (i) Neural Networks and Deep Learning; (ii) Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization; (iii) Structuring Machine Learning Projects; (iv) Convolutional Neural Networks;
Tech-with-Vidhya/Coursera-Deep-Learning-Specialization-2023
Contains Solutions to the Deep Learning Specialization - Coursera
Tech-with-Vidhya/Coursera-Machine-Learning-Specialization-2023
Contains Solutions and Notes for the Machine Learning Specialization by Stanford University and DeepLearning.AI on Coursera (2022), taught by Prof. Andrew Ng
Tech-with-Vidhya/ETL_Stocks_Data_Pipeline_AWS_EMR_Cluster_Hive_Tables_Dynamic_Real-time_Tableau_Dashboard
Tech-with-Vidhya/MLOps_AWS_Docker_Gunicorn_Flask_NLP_LDA_Topic_Modeling_sklearn_Framework
This is an AWS MLE and MLOps NLP LDA Topic Modeling Project.
Tech-with-Vidhya/MLOps_AWS_Kubernetes_LoadBalancing_Docker_Flask_Banking_Customers_Digital_Transformation_Classifier
This is an AWS MLE and MLOps Bank Customers Digital Transformation Project.
Tech-with-Vidhya/MLOps_AWS_Lightsail_Docker_Flask_ARCH_GARCH_Time_Series_Modeling_Statistical_Framework
This is an AWS MLE and MLOps ARCH and GARCH Time Series Forecasting Statistical Modeling Project.
Tech-with-Vidhya/MLOps_AWS_Lightsail_Docker_Flask_Gaussian_Based_Time_Series_Modeling_Framework
This is an AWS MLE and MLOps Time Series Forecasting Modeling Project.
Tech-with-Vidhya/MLOps_AWS_Lightsail_Docker_Flask_Multi-Linear_Regression_Time_Series_Modeling_sklearn_Framework
This is an AWS MLE and MLOps Time Series Forecasting Project using Multiple Linear Regression Model.
Tech-with-Vidhya/NLP_Multi-Class_Text_Classification_using_BERT_Model
Tech-with-Vidhya/NLP_Text_Classification_with_Transformers_RoBERTa_and_XLNet_Models
Tech-with-Vidhya/Technical_Assignment_VidhyalakshmiParthasarathy
Tech-with-Vidhya/advanced-data-engineering-with-databricks
Tech-with-Vidhya/apache-spark-programming-with-databricks
Tech-with-Vidhya/cli-demo
Public resources for Databricks CLI demo
Tech-with-Vidhya/copilot-codespaces-vscode
Develop with AI-powered code suggestions using GitHub Copilot and VS Code
Tech-with-Vidhya/Coursera-Deep-Learning-Specialization-2021-Other
This repo contains the updated version of all the assignments/labs (done by me) of Deep Learning Specialization on Coursera by Andrew Ng. It includes building various deep learning models from scratch and implementing them for object detection, facial recognition, autonomous driving, neural machine translation, trigger word detection, etc.
Tech-with-Vidhya/data-engineering-with-databricks
Tech-with-Vidhya/NLP_Chatbot_Conversation_Engine_Using_NLTK
Tech-with-Vidhya/NLP_Multi-Class_Text_Classification_Using_RNN_LSTM
Tech-with-Vidhya/NLP_Topic_Modeling_using_K-Means_Clustering