rajeshmore1
Data Scientist / Machine Learning Engineer Having Experience In Machine Learning And Deep Learning
Pune
Pinned Repositories
Capstone-Project-2
Corona Virus Sentiment Analysis. This challenge asks you to build a classification model to predict the sentiment of COVID-19 tweets. The tweets were pulled from Twitter and then tagged manually.
DataScience_Mentorship
Course Material - Data Science Program
GCP-Certification-Professional-Machine-Learning-Engineer
Notes For Reference
icc-data-analysis
Analyzing T20 match data
Loan-Default-Prediction
Many companies in the financial industry invest considerable resources in improving their predictive models to gain better insight into their customers. Interest in model improvement has intensified in recent years, mostly because of the fast development of machine learning and artificial intelligence. For a standard lending institution, a high-performing default prediction model helps to considerably reduce credit loss, resulting in higher revenue and profits. Usually, the better the predictive model, the more efficient the underwriting policy and collection process. A well-functioning model should distinguish creditworthy customers from those that are credit risks. Often, a more predictive credit-decisioning model can identify a greater number of customers within an institution’s specified risk tolerance, which should expand revenues as well. In this project the goal is to increase detection of loans that will default before the loan is issued or offered by the P2P lending company Lending Club. Peer-to-peer lending differs from traditional financial institutions like banks or commercial lending companies: Lending Club is a mediator between investors and borrowers, earning money by charging both. Lending Club's main interest is to attract more clients and maintain portfolio size. The motivation of borrowers is clear: they want the cheapest capital possible, so they seek the best offer available to them on the market. The motivation of investors is obvious as well. Investors look for high ROI (return on investment), but since returns are proportional to risks, we can say more formally that investors look for an appropriate return-to-risk ratio. If investors experience losses, churn may increase. The underwriting process for Lending Club looks like this.
A borrower applies for a loan; if he or she meets all the basic requirements, Lending Club uses its scoring model to assign the client to a grade. There are 7 grades and 35 sub-grades. The interest rate depends on the sub-grade. After that, Lending Club gives investors access to the loan, with information about the loan and the borrower (including grade and sub-grade), and investors decide whether or not to invest in it. The lower the grade, the higher the interest rate, which means that investors may take higher risks to gain potentially higher returns. In seeking to reduce the default rate, we can end up with an overly restrictive underwriting policy that does not necessarily correlate with higher ROI for investors, because investors will not be able to choose risky loans, which means lower interest. For Lending Club this probably means losing investors with a high risk appetite and borrowers with a weak credit history or, in Lending Club's case, those who need a higher loan amount.
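The grade structure described above (7 grades, 35 sub-grades) can be enumerated in a few lines. This is a minimal sketch that assumes grades are lettered A to G with five numbered sub-grades each; that labeling is not stated in the description, and `sub_grade_rank` is a hypothetical helper, not part of the repository.

```python
from itertools import product

# Assumption: grades are lettered A-G and each has sub-grades numbered 1-5,
# which gives the 7 grades and 35 sub-grades mentioned in the description.
GRADES = "ABCDEFG"
SUB_LEVELS = range(1, 6)

# Ordered from the best sub-grade (A1) to the worst (G5).
sub_grades = [f"{grade}{level}" for grade, level in product(GRADES, SUB_LEVELS)]


def sub_grade_rank(sub_grade: str) -> int:
    """Return the 0-based position of a sub-grade in the A1..G5 ordering."""
    return sub_grades.index(sub_grade)


print(len(GRADES), len(sub_grades))
print(sub_grades[0], sub_grades[-1])
```

An ordinal rank like this is one simple way to feed the sub-grade into a model as a single numeric feature, since the interest rate depends on the sub-grade.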
Natural-Language-Processing-Course-
Author: Rajesh More
Predict-whether-a-startup-will-get-funded-in-the-next-three-months.
There has been staggering growth in investments in early-stage startups in the last 5 years. Many big VC firms are increasingly interested in the startup funding space. Your task is to predict whether a startup will get funding in the next three months using app traction data and startup details. This funding can be seed funding, Series A, Series B, and so on.
Semiconductor-manufacturing-process
CONTEXT: A complex modern semiconductor manufacturing process is normally under constant surveillance via the monitoring of signals/variables collected from sensors and/or process measurement points. However, not all of these signals are equally valuable in a specific monitoring system. The measured signals contain a combination of useful information, irrelevant information and noise. Engineers typically have a much larger number of signals than are actually required. If we consider each type of signal as a feature, then feature selection may be applied to identify the most relevant signals. The process engineers may then use these signals to determine key factors contributing to yield excursions downstream in the process. This will enable increased process throughput, decreased time to learning and reduced per-unit production costs. These signals can be used as features to predict the yield type, and by analysing and trying out different combinations of features, the essential signals that affect the yield type can be identified. • DATA DESCRIPTION: sensor-data.csv : (1567, 592) The data consists of 1567 examples each with 591 features. The dataset presented in this case represents a selection of such features where each example represents a single production entity with associated measured features, and the labels represent a simple pass/fail yield for in-house line testing. Target column “-1” corresponds to a pass and “1” corresponds to a fail, and the data time stamp is for that specific test point. • PROJECT OBJECTIVE: We will build a classifier to predict the Pass/Fail yield of a particular process entity and analyse whether all the features are required to build the model or not.
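One very simple form of the feature selection mentioned above is to drop signals that never change, since a constant signal cannot help separate pass from fail. The following is a minimal sketch using only the standard library; the toy readings and the helper names `low_variance_columns` and `drop_columns` are hypothetical, and this is not presented as the method the repository uses.

```python
from statistics import pvariance


def low_variance_columns(rows, threshold=0.0):
    """Return indices of columns whose population variance is <= threshold."""
    columns = list(zip(*rows))  # transpose rows into columns
    return [i for i, col in enumerate(columns) if pvariance(col) <= threshold]


def drop_columns(rows, to_drop):
    """Return a copy of rows without the columns listed in to_drop."""
    drop = set(to_drop)
    return [[v for i, v in enumerate(row) if i not in drop] for row in rows]


# Hypothetical toy readings: 4 production entities x 3 sensor signals.
# The first signal is constant and therefore carries no information.
toy_rows = [
    [1.0, 5.0, 0.2],
    [1.0, 6.5, 0.4],
    [1.0, 4.8, 0.1],
    [1.0, 7.1, 0.9],
]

constant = low_variance_columns(toy_rows)
reduced = drop_columns(toy_rows, constant)
print(constant, len(reduced[0]))
```

A filter like this ignores the pass/fail label entirely, so in practice it would only be a first pass before label-aware selection or model-based importance ranking.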
Statmike-Vertex-AI-Repo
https://github.com/statmike/vertex-ai-mlops.git
unsupervised-topic-modelling-of-unlabeled-text-descriptions
rajeshmore1's Repositories
rajeshmore1/Natural-Language-Processing-Course-
Author: Rajesh More
rajeshmore1/Semiconductor-manufacturing-process
CONTEXT: A complex modern semiconductor manufacturing process is normally under constant surveillance via the monitoring of signals/variables collected from sensors and/or process measurement points. However, not all of these signals are equally valuable in a specific monitoring system. The measured signals contain a combination of useful information, irrelevant information and noise. Engineers typically have a much larger number of signals than are actually required. If we consider each type of signal as a feature, then feature selection may be applied to identify the most relevant signals. The process engineers may then use these signals to determine key factors contributing to yield excursions downstream in the process. This will enable increased process throughput, decreased time to learning and reduced per-unit production costs. These signals can be used as features to predict the yield type, and by analysing and trying out different combinations of features, the essential signals that affect the yield type can be identified. • DATA DESCRIPTION: sensor-data.csv : (1567, 592) The data consists of 1567 examples each with 591 features. The dataset presented in this case represents a selection of such features where each example represents a single production entity with associated measured features, and the labels represent a simple pass/fail yield for in-house line testing. Target column “-1” corresponds to a pass and “1” corresponds to a fail, and the data time stamp is for that specific test point. • PROJECT OBJECTIVE: We will build a classifier to predict the Pass/Fail yield of a particular process entity and analyse whether all the features are required to build the model or not.
rajeshmore1/DataScience_Mentorship_Assignments
Assignments for Students
rajeshmore1/Multiclass-Classification-Random-Forest
Sensors Data.
rajeshmore1/Neural-Network-Regression-Signal-Strength-Data
DOMAIN: Electronics and Telecommunication • CONTEXT: A communications equipment manufacturing company has a product that is responsible for emitting informative signals. The company wants to build a machine learning model that can predict the equipment’s signal quality using various parameters. • DATA DESCRIPTION: The data set contains information on various signal tests performed: 1. Parameters: Various measurable signal parameters. 2. Signal_Quality: Final signal strength or quality. • PROJECT OBJECTIVE: The need is to build a regressor that can use these parameters to determine the signal strength or quality [as a number]. Steps and tasks: [Total Score: 10 points] 1. Import data. 2. Data analysis & visualisation • Perform relevant and detailed statistical analysis on the data. • Perform relevant and detailed univariate, bivariate and multivariate analysis. Hint: Use your best analytical approach. You can even mix and match columns to create new columns that can be used for better analysis. Create your own features if required. Be highly experimental and analytical here to find relevant hidden patterns. 3. Design, train, tune and test a neural network regressor. Hint: Use the best approach to refine and tune the data or the model. Be highly experimental here. 4. Pickle the model for future use.
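The final step above, pickling the model for future use, can be sketched with Python's standard `pickle` module. This is a minimal illustration under stated assumptions: the `params` dictionary and the linear `predict` function are hypothetical stand-ins for a trained regressor, not the neural network from the repository.

```python
import pickle


def predict(params, features):
    """Linear prediction: dot(weights, features) + bias."""
    weighted = sum(w * x for w, x in zip(params["weights"], features))
    return weighted + params["bias"]


# Hypothetical fitted parameters standing in for a trained model.
params = {"weights": [0.5, -1.0], "bias": 2.0}

# Serialize to bytes; writing these bytes to a file would persist the model.
payload = pickle.dumps(params)

# Later (for example in another process), restore the parameters and reuse them.
restored = pickle.loads(payload)
print(predict(restored, [4.0, 1.0]))
```

Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code. Deep learning frameworks also usually offer their own save formats, which may be a better fit for a real neural network.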
rajeshmore1/Python-Basics
Basic Python Tutorial (Test)
rajeshmore1/Databricks-Academy-Spark
rajeshmore1/GCP-Certification-Professional-Machine-Learning-Engineer
Notes For Reference
rajeshmore1/gh-repo-clone-rajeshmore1-Spaceflight-Project-Using-Kedro
rajeshmore1/Hands-On-Natural-Language-Processing
rajeshmore1/Kedro-Space-Flights-Using-Jupyter
Random Forest regressor added at the end of evaluation node.
rajeshmore1/ML-Engineering-Crash-Course
rajeshmore1/PySpark-Tutorial
rajeshmore1/Python_Basics_Test
rajeshmore1/Recommender-System-By-Prasanna-Venkatesh
rajeshmore1/1729-AI-Conference-Fractal-AnalyicsVidhya_Event-
rajeshmore1/A-B-Testing-Detailed-Analysis
rajeshmore1/GCP-Data-Analyst-Training
rajeshmore1/Machine-Learning-Interpretability
Model Explainability
rajeshmore1/portfolio
rajeshmore1/Python-Training-Data-Axle
40 Hours of Training from 6 December to 17 December
rajeshmore1/Question-Answer-System-and-Text-Summarization
rajeshmore1/Recommender-System-ResSys-Training
Learning objectives: Top-N recommender architectures; Types of recommenders; Python basics for working with recommenders; Evaluating recommender systems; Measuring your recommender; Reviewing a recommender engine framework; Content-based filtering; Neighborhood-based collaborative filtering; Matrix factorization methods; Deep learning basics; Applying deep learning to recommendations; Scaling with Apache Spark, Amazon DSSTNE, and AWS SageMaker; Real-world challenges and solutions with recommender systems; Case studies from YouTube and Netflix; Building hybrid, ensemble recommenders
rajeshmore1/Spaceflight-Project-Using-Kedro
rajeshmore1/SSMS-Alternative-On-MAcOS
This repo provides a step-by-step procedure for accessing SQL Server databases from a Mac. The required software is Docker and Azure Data Studio.
rajeshmore1/test-repo
testing repository upload from local
rajeshmore1/Vehicle-Data--PCA-and-SVM
rajeshmore1/Vertex-AI-with-Colab
rajeshmore1/Walter-Pitts-Squadron
Covers basic Python topics such as lists and tuples
rajeshmore1/Face_Recognition_Drowsiness_Detection