binoydutt
I am a data enthusiast and appreciate the immense possibilities data has brought to today's business landscape. I enjoy working with the challenges associat
Dallas, Texas
Pinned Repositories
Machine-Learning--Coursera
A few machine learning algorithms written from scratch in Octave
Machine-Learning-Coursera
This repository contains various machine learning algorithms written from scratch in Octave
Map-Reduce-Hadoop-Steaming-API
ModelBuilding-Using-SAS
Pokemon-GO-Analytics
Pokemon Go! became a hugely popular augmented reality (AR) game in the summer of 2016. In this project, we wanted to understand the success of the mobile game. Specifically, the goals of this project were to (1) scrape web data using BeautifulSoup, (2) construct a Pandas dataframe, (3) explore and visualize the numeric data using seaborn, (4) use sklearn to build machine learning models that predict the app’s review counts using Linear, Ridge, Lasso, and Elastic Net regression, tuning various hyperparameters and the independent variables of the model, and (5) analyze the app’s screenshot images using deep learning with TensorFlow, generating image tags along with their respective probabilities.
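As a rough illustration of step (4), the sketch below fits a one-feature ridge regression in closed form and picks the penalty strength on a held-out validation split. It is dependency-free on purpose and is not the project's sklearn code; the function names, the alpha grid, and the toy numbers are all assumptions made for this example, and only ridge is shown (not Lasso or Elastic Net).

```python
def fit_ridge(xs, ys, alpha):
    """Closed-form ridge fit for one feature with an unpenalized intercept.

    Minimizes sum((y - slope*x - intercept)^2) + alpha * slope^2.
    Assumes the x values are not all identical when alpha is 0.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # alpha shrinks the slope toward zero
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mse(xs, ys, slope, intercept):
    """Mean squared error of the fitted line on (xs, ys)."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_alpha(train, valid, alphas):
    """Return (best_alpha, validation_mse) over a grid of candidate penalties."""
    best = None
    for alpha in alphas:
        slope, intercept = fit_ridge(train[0], train[1], alpha)
        err = mse(valid[0], valid[1], slope, intercept)
        if best is None or err < best[1]:
            best = (alpha, err)
    return best


# Hypothetical toy data: one numeric feature versus a review count.
train = ([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])
valid = ([4.0, 5.0], [9.0, 11.0])
best_alpha, best_err = select_alpha(train, valid, [0.0, 1.0, 10.0])
```

With real scraped data the grid would usually be wider and scored by cross-validation rather than a single split.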
Resume-Job-Description-Matching
The purpose of this project was to beat the Applicant Tracking Systems that most organizations use to filter out resumes. To achieve this goal, I had to come up with a universal score that helps the applicant understand how well a resume currently matches a job description. The following steps were undertaken for this project: 1) Job descriptions were collected from the Glassdoor website using Selenium, as other scrapers failed. 2) PDF resumes were parsed using PDFMiner. 3) A vector representation of each job description was created: word2vec was used to embed words in a 300-dimensional vector space, with each document represented as a list of word vectors. 4) Each word was given a weight to counter the few job-description-specific words that had to be dealt with; TF-IDF scores were used as the word weights. 5) Important skill-related words were given higher weights, and the overall mean of each job description was obtained using the product of each word vector and its TF-IDF score. 6) Cosine similarity was used to measure the similarity between the job description and the resume. 7) Various natural language processing techniques were identified to suggest improvements to the resume that could help increase the match score.
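Steps 3 to 6 can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the repository's code: the tiny 3-dimensional `embeddings` table is a hypothetical stand-in for real 300-dimensional word2vec vectors, the smoothed IDF formula is one common choice, and every function name here is an assumption.

```python
import math
from collections import Counter


def tfidf_weights(doc, corpus):
    """Return {word: tf * idf} for one tokenized document against a corpus."""
    counts = Counter(doc)
    n_docs = len(corpus)
    weights = {}
    for word, count in counts.items():
        df = sum(1 for other in corpus if word in other)
        idf = math.log((1 + n_docs) / (1 + df)) + 1  # smoothed IDF
        weights[word] = (count / len(doc)) * idf
    return weights


def weighted_mean_vector(doc, weights, embeddings):
    """TF-IDF-weighted mean of the word vectors found in the document."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word in set(doc):
        if word in embeddings:  # skip out-of-vocabulary words
            w = weights.get(word, 0.0)
            total = [t + w * v for t, v in zip(total, embeddings[word])]
            weight_sum += w
    return [t / weight_sum for t in total] if weight_sum else total


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_score(job, resume, corpus, embeddings):
    """Similarity between a job description and a resume (both token lists)."""
    job_vec = weighted_mean_vector(job, tfidf_weights(job, corpus), embeddings)
    res_vec = weighted_mean_vector(resume, tfidf_weights(resume, corpus), embeddings)
    return cosine_similarity(job_vec, res_vec)


# Hypothetical toy embeddings standing in for word2vec output.
embeddings = {
    "python": [1.0, 0.0, 0.0],
    "sql": [0.8, 0.2, 0.0],
    "statistics": [0.6, 0.4, 0.0],
    "cooking": [0.0, 0.0, 1.0],
    "baking": [0.0, 0.1, 0.9],
}
job = ["python", "sql", "statistics"]
resume_good = ["python", "statistics"]
resume_bad = ["cooking", "baking"]
corpus = [job, resume_good, resume_bad]

good_score = match_score(job, resume_good, corpus, embeddings)
bad_score = match_score(job, resume_bad, corpus, embeddings)
```

In this toy setup the resume that shares skill words with the job description scores higher than the unrelated one. The extra boost for skill-related words described in step 5 is not modeled here.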
Social-Analysis-Project
The project analyzes people’s sentiment and topics of discussion about the new administration. The Twitter API was used to collect tweets about President Trump. Sentiment analysis was conducted to measure how positive or negative the collected tweets are, which can serve as an indirect measure of President Trump’s approval. To find what kinds of topics are discussed in relation to the new president, word clouds were created and topic modeling was conducted on the collected tweets. To compare the geographic variation in opinions, tweets were collected from 5 different states and the aforementioned three analyses were conducted on them. Finally, the project concludes by describing the insights gained from these analyses.
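The description does not say which sentiment method was used. As one possible reading, here is a minimal lexicon-based sketch that scores each tweet and averages the scores per state. The word lists, the state codes, and the function names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny sentiment lexicons.
POSITIVE = {"good", "great", "support", "love"}
NEGATIVE = {"bad", "terrible", "oppose", "hate"}


def sentiment_score(text):
    """Score in [-1, 1]: (positive - negative) / (positive + negative) word hits."""
    tokens = [token.strip(".,!?").lower() for token in text.split()]
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    total = pos + neg
    return (pos - neg) / total if total else 0.0  # neutral when no lexicon hits


def mean_sentiment_by_state(tweets):
    """Average tweet score per state from (state, text) pairs."""
    scores = defaultdict(list)
    for state, text in tweets:
        scores[state].append(sentiment_score(text))
    return {state: sum(vals) / len(vals) for state, vals in scores.items()}


# Hypothetical sample input; real tweets would come from the Twitter API.
sample = [("TX", "Great job"), ("TX", "bad idea"), ("CA", "hate it")]
by_state = mean_sentiment_by_state(sample)
```

A production analysis would more likely rely on a trained classifier or an established lexicon, but the per-state aggregation step would look much the same.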
Spark-Machine-Learning
binoydutt's Repositories
binoydutt/Resume-Job-Description-Matching
binoydutt/Pokemon-GO-Analytics
binoydutt/Machine-Learning--Coursera
binoydutt/Machine-Learning-Coursera
binoydutt/Map-Reduce-Hadoop-Steaming-API
binoydutt/ModelBuilding-Using-SAS
binoydutt/Social-Analysis-Project
binoydutt/Spark-Machine-Learning