rishabhpahuja
I am a 2023 master's graduate from Carnegie Mellon University with a specialization in deep learning and computer vision.
Gentherm, Farmington Hills, MI
Pinned Repositories
Apple-Tracking
This software pipeline detects and tracks apples in real time in an orchard while handling cases of apple occlusion. Apples are detected and segmented using a deep-learning-based approach and are tracked in the world coordinate frame using an Extended Kalman Filter.
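As a rough illustration of the tracking idea (not the repository's code), the sketch below runs an Extended Kalman Filter on a single static apple. Everything specific here is an assumption made for the example: the state is the apple's 3D position, the camera sits at the world origin, each detection provides a pixel location plus a depth reading, and the intrinsics `FX`, `FY`, `CX`, `CY` and all noise values are hypothetical. An occluded frame is modeled as a missing measurement, so the filter only predicts.

```python
import numpy as np

# Hypothetical pinhole intrinsics (focal lengths and principal point, in pixels).
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0


def project(x):
    """Nonlinear measurement model h(x): 3D position -> (u, v, depth)."""
    X, Y, Z = x
    return np.array([FX * X / Z + CX, FY * Y / Z + CY, Z])


def back_project(z):
    """Inverse of project(); used to initialise a track from one detection."""
    u, v, d = z
    return np.array([(u - CX) * d / FX, (v - CY) * d / FY, d])


def project_jacobian(x):
    """Jacobian of project() with respect to the state."""
    X, Y, Z = x
    return np.array([
        [FX / Z, 0.0, -FX * X / Z**2],
        [0.0, FY / Z, -FY * Y / Z**2],
        [0.0, 0.0, 1.0],
    ])


def ekf_step(x, P, z, Q, R):
    """One predict/update cycle; z=None means the apple is occluded."""
    # Predict: a static apple, so the motion model is the identity.
    x_pred, P_pred = x, P + Q
    if z is None:
        return x_pred, P_pred  # no detection: keep the prediction
    # Update: linearise the measurement model around the prediction.
    H = project_jacobian(x_pred)
    innovation = z - project(x_pred)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    return x_pred + K @ innovation, (np.eye(3) - K @ H) @ P_pred


def run_demo(steps=20, seed=0):
    """Track one simulated apple with every fifth frame occluded."""
    rng = np.random.default_rng(seed)
    truth = np.array([0.3, -0.2, 2.0])        # metres, made-up ground truth
    noise_std = np.array([1.0, 1.0, 0.1])     # pixels, pixels, metres
    R = np.diag(noise_std**2)
    Q = 1e-6 * np.eye(3)

    def observe():
        return project(truth) + rng.normal(0.0, noise_std)

    x = back_project(observe())               # initialise from first detection
    P = 0.1 * np.eye(3)
    for k in range(steps):
        z = None if k % 5 == 4 else observe()
        x, P = ekf_step(x, P, z, Q, R)
    return x, P, truth
```

Calling `run_demo()` returns the final estimate, its covariance, and the simulated ground truth. A moving camera would replace `project` with a model that first transforms the world point into the current camera frame.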
Attention-is-All-You-Need
Implementation of multi-headed self-attention
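For readers unfamiliar with the mechanism, here is a minimal NumPy sketch of multi-head self-attention for a single unbatched sequence. It is not taken from the repository: the function name, the weight-matrix arguments, and the omission of masking, biases, and dropout are all simplifying assumptions.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """x: (seq_len, d_model); each weight matrix: (d_model, d_model)."""
    seq_len, d_model = x.shape
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads

    def split_heads(t):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    # Queries, keys, and values all come from the same input sequence.
    q = split_heads(x @ w_q)
    k = split_heads(x @ w_k)
    v = split_heads(x @ w_v)

    # Scaled dot-product attention, computed independently per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)       # (num_heads, seq_len, seq_len)
    context = weights @ v                    # (num_heads, seq_len, d_head)

    # Concatenate the heads and apply the output projection.
    merged = context.transpose(1, 0, 2).reshape(seq_len, d_model)
    return merged @ w_o, weights
```

Each row of `weights` sums to 1, so every output token is a convex combination of the value vectors within its head before the final projection.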
Credit-Score-
With the growing demand for credit enabled by technologies such as the credit card, modern credit evaluation systems cannot rely on judgmental approaches that subjectively assess each individual. It is important to develop objective, unbiased methods that can reliably and quickly summarize someone's creditworthiness without pulling their credit report. Furthermore, low-income and young individuals often lack a credit score because of factors such as missing income history or improper documentation. In the US alone, over 25 million Americans were credit invisible in 2015 according to the CFPB [1]. Such a system would let financial institutions quickly judge the risk of lending to an individual, and would give individuals an opportunity to monitor their credit score and take steps to improve it. Hence, we plan to build a credit score predictor to tackle this challenge, using the data provided in [2] to build and test our models. The biggest difference between our methodology and that of typical credit score calculators, such as FICO, is that we plan to consider the parameters that any person can furnish irrespective of their income and credit history. In this framework, the input data X will be parameters describing the person's financial status, such as current income, marital status, number of dependents, and real estate ownership, among others. The output data Y will be the predicted credit score of the individual. Because our dataset does not have predefined labels, we will try to correlate our feature vectors to identify a pseudo-label vector that can effectively distinguish between individuals with good and bad credit. We also plan to test skewed subsets of our data, in which one label heavily dominates, to see which method works best for such datasets.
We also plan to try different feature selection methods, such as F-score, which tries to identify the features that best separate the data, and, if time permits, more sophisticated methods such as genetic algorithms. Several works show that correct feature selection greatly improves the efficiency of the model [3-7]. Hence, we believe that devoting resources to finding the optimal set of features will help increase the efficiency of the model we create. We will also try to tackle the issue of dataset imbalance, since the literature shows that certain models perform poorly on imbalanced datasets [8]. We will use several models to predict the credit score: linear regression, logistic regression, Naïve Bayes, k-Nearest Neighbors (k-NN), Decision Trees (DTs), and Support Vector Machines (SVMs). We expect a regression plot for the regression models and probability curves for the classification models. To evaluate the models, we plan to use metrics such as Percentage Correctly Classified (PCC), Sensitivity/Recall, Type I Error, Type II Error, and Receiver Operating Characteristic (ROC) curves, among others [9]. Finally, we hope to build a kit that can identify the optimal algorithm and its parameters based on the input data provided. This will help automate the entire process for new datasets and allow the approach to be used on real problems. The members of our team are Aditya Rathi, Rishabh Pahuja, and Vatsal Joshi.
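The evaluation metrics named above can be computed directly from a binary confusion matrix. The sketch below is illustrative only and assumes a convention the description does not state: label 1 means "good credit" and is the positive class, Type I Error is the false positive rate, and Type II Error is the false negative rate. The function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, with 1 as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def credit_metrics(y_true, y_pred):
    """PCC, sensitivity, and Type I/II error rates under the assumed convention."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    positives = tp + fn   # applicants whose true label is 1
    negatives = tn + fp   # applicants whose true label is 0
    return {
        "pcc": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / positives if positives else 0.0,
        "type_i_error": fp / negatives if negatives else 0.0,
        "type_ii_error": fn / positives if positives else 0.0,
    }


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
    print(credit_metrics(y_true, y_pred))
```

On an imbalanced dataset, PCC alone can look good while one of the error rates is poor, which is why the error rates are reported alongside it.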
dcase-few-shot-bioacoustic
pytorch-3dunet_orignal
3D U-Net model for volumetric semantic segmentation, written in PyTorch.
rishabhpahuja
Config files for my GitHub profile.
Wombat
This repository consists of two major components of a project called E-waste Recycling System. The first component generates a labelled dataset for training a U-Net segmentation network. The second component performs non-rigid registration using the Demons algorithm.
SymbolLearning
The main repository for the project, which learns effects and observation symbols from self-supervised robot interaction in simulation.
rishabhpahuja's Repositories
rishabhpahuja/Apple-Tracking
This software pipeline detects and tracks apples in real time in an orchard while handling cases of apple occlusion. Apples are detected and segmented using a deep-learning-based approach and are tracked in the world coordinate frame using an Extended Kalman Filter.
rishabhpahuja/Attention-is-All-You-Need
Implementation of multi-headed self-attention
rishabhpahuja/Credit-Score-
rishabhpahuja/Wombat
This repository consists of two major components of a project called E-waste Recycling System. The first component generates a labelled dataset for training a U-Net segmentation network. The second component performs non-rigid registration using the Demons algorithm.
rishabhpahuja/dcase-few-shot-bioacoustic
rishabhpahuja/pytorch-3dunet_orignal
3D U-Net model for volumetric semantic segmentation, written in PyTorch.
rishabhpahuja/rishabhpahuja
Config files for my GitHub profile.