sherkhan15
Cognitive Science | Behavioural Science | Artificial Intelligence | Research Developer, DRDO | Member of Google Developer Society | Data Engineer
IFI tech Solutions, New Delhi
Pinned Repositories
Algorithms
Searching and sorting algorithms.
distarctor-Detection_Eyes_Emotions_CI
The model uses the computer's webcam to calculate a CI from a composite score of emotional state and eye/head movement, and displays a real-time engagement and emotion classification.
EDGAR-SEC-REPORTS-ANALYSIS
Data was extracted from EDGAR filings and text analysis was performed on it.
Forest-Cover-Predictor-with-ANN-SIGMOID-VS-RELU-
Context: This dataset contains tree observations from four areas of the Roosevelt National Forest in Colorado. All observations are cartographic variables (no remote sensing) from 30 meter x 30 meter sections of forest. There are over half a million measurements in total.
Content: This dataset includes information on tree type, shadow coverage, distance to nearby landmarks (roads, etc.), soil type, and local topography.
Acknowledgement: This dataset is part of the UCI Machine Learning Repository, which is its original source. The original database owners are Jock A. Blackard, Dr. Denis J. Dean, and Dr. Charles W. Anderson of the Remote Sensing and GIS Program at Colorado State University.
Inspiration: Can you build a model that predicts what types of trees grow in an area based on the surrounding characteristics? A past Kaggle competition addressed this topic. What kinds of trees are most common in the Roosevelt National Forest? Which tree types can grow in more diverse environments? Are there certain tree types that are sensitive to an environmental factor, such as elevation or soil type?
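The repository name points to a comparison of sigmoid and ReLU activations, although the description does not say how the comparison was run. As a minimal sketch of what distinguishes the two (the function names here are illustrative, not taken from the repository), the two activations and their derivatives can be written as:

```python
import math

def sigmoid(x):
    """Logistic sigmoid: squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of the sigmoid; at most 0.25, shrinking for large |x|."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu(x):
    """Rectified linear unit: passes positive inputs, zeroes the rest."""
    return max(0.0, x)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

# Compare activation values and gradients at a few sample inputs.
for x in (-4.0, 0.0, 4.0):
    print(x, sigmoid(x), sigmoid_grad(x), relu(x), relu_grad(x))
```

The sigmoid gradient never exceeds 0.25 and decays quickly away from zero, while the ReLU gradient stays at 1 for any positive input. This is one common reason the two are compared in hidden layers.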
K-Means-Project
K-Means Clustering Project
Uses K-Means clustering to cluster universities into two groups: private and public.
The Data
The data frame has 777 observations on the following 18 variables.
Private: a factor with levels No and Yes indicating private or public university
Apps: number of applications received
Accept: number of applications accepted
Enroll: number of new students enrolled
Top10perc: pct. of new students from top 10% of H.S. class
Top25perc: pct. of new students from top 25% of H.S. class
F.Undergrad: number of full-time undergraduates
P.Undergrad: number of part-time undergraduates
Outstate: out-of-state tuition
Room.Board: room and board costs
Books: estimated book costs
Personal: estimated personal spending
PhD: pct. of faculty with Ph.D.s
Terminal: pct. of faculty with terminal degree
S.F.Ratio: student/faculty ratio
perc.alumni: pct. of alumni who donate
Expend: instructional expenditure per student
Grad.Rate: graduation rate
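To illustrate the clustering step, here is a minimal sketch of K-Means (Lloyd's algorithm) with two clusters, written in plain Python. It is not the repository's code. The six points are made-up (Outstate in thousands, Grad.Rate) pairs, and seeding the centroids with the first k points is an assumption chosen to keep the run deterministic. The Private column plays no part in the clustering itself; it would only be used afterwards to check how well the clusters match.

```python
import math

def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm. The first k points seed the centroids (assumption)."""
    if len(points) < k:
        raise ValueError("need at least k points")
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        converged = new_labels == labels and new_centroids == centroids
        labels, centroids = new_labels, new_centroids
        if converged:
            break
    return labels, centroids

# Made-up (Outstate in thousands, Grad.Rate) pairs, for illustration only.
points = [(7.0, 55.0), (20.0, 85.0), (6.5, 50.0),
          (21.0, 88.0), (8.0, 58.0), (19.0, 80.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

On real data the 17 numeric columns sit on very different scales, so standardising them before clustering is usually worth considering.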
Random-Forest-Project-Lender-Score-
For this project we will be exploring publicly available data from LendingClub.com. Lending Club connects people who need money (borrowers) with people who have money (investors). As an investor, you would want to invest in people whose profile shows a high probability of paying you back. We will try to create a model that helps predict this. We will use lending data from 2007-2010 and try to classify and predict whether or not the borrower paid back their loan in full.
Here is what the columns represent:
credit.policy: 1 if the customer meets the credit underwriting criteria of LendingClub.com, and 0 otherwise.
purpose: The purpose of the loan (takes values "credit_card", "debt_consolidation", "educational", "major_purchase", "small_business", and "all_other").
int.rate: The interest rate of the loan, as a proportion (a rate of 11% would be stored as 0.11). Borrowers judged by LendingClub.com to be more risky are assigned higher interest rates.
installment: The monthly installments owed by the borrower if the loan is funded.
log.annual.inc: The natural log of the self-reported annual income of the borrower.
dti: The debt-to-income ratio of the borrower (amount of debt divided by annual income).
fico: The FICO credit score of the borrower.
days.with.cr.line: The number of days the borrower has had a credit line.
revol.bal: The borrower's revolving balance (amount unpaid at the end of the credit card billing cycle).
revol.util: The borrower's revolving line utilization rate (the amount of the credit line used relative to total credit available).
inq.last.6mths: The borrower's number of inquiries by creditors in the last 6 months.
delinq.2yrs: The number of times the borrower had been 30+ days past due on a payment in the past 2 years.
pub.rec: The borrower's number of derogatory public records (bankruptcy filings, tax liens, or judgments).
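Three of the columns above are derived values, and a small sketch can make their conventions concrete. The function name and its three raw inputs are hypothetical. The formulas follow the column descriptions literally, so the scale actually stored in the data file may differ.

```python
import math

def encode_borrower(rate_percent, annual_income, total_debt):
    """Encode raw values per the column descriptions (hypothetical helper)."""
    if annual_income <= 0:
        raise ValueError("annual_income must be positive")
    return {
        # An 11% rate is stored as the proportion 0.11.
        "int.rate": rate_percent / 100.0,
        # Natural log of the self-reported annual income.
        "log.annual.inc": math.log(annual_income),
        # Debt divided by annual income, as the description states.
        "dti": total_debt / annual_income,
    }

# Made-up borrower, for illustration only.
row = encode_borrower(rate_percent=11.0, annual_income=60000.0, total_debt=12000.0)
print(row)
```

Taking `math.exp(row["log.annual.inc"])` recovers the income, which is handy when reading model outputs back in original units.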
Sleeping-EEG-Analysis
A 30-second extract of real slow-wave sleep from one young individual. The sampling frequency is 100 Hz and the channel is F3.
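The stated duration and sampling frequency fix the basic properties of the recording: 3000 samples, a 50 Hz Nyquist limit, and a frequency resolution of 1/30 Hz. The sketch below derives these and scans the delta band (roughly 0.5 to 4 Hz, the range usually associated with slow-wave sleep) for the strongest component. The signal is a synthetic 1 Hz sine standing in for the real data, and the naive DFT helper is an illustrative assumption, not the repository's method.

```python
import cmath
import math

FS_HZ = 100        # sampling frequency stated in the description
DURATION_S = 30    # extract length stated in the description

n_samples = FS_HZ * DURATION_S    # 3000 samples
nyquist_hz = FS_HZ / 2            # highest representable frequency
resolution_hz = 1 / DURATION_S    # spacing between DFT bins

# Synthetic stand-in for the recording: a pure 1 Hz sine (not real EEG).
signal = [math.sin(2 * math.pi * 1.0 * n / FS_HZ) for n in range(n_samples)]

def dft_magnitude(x, k):
    """Magnitude of DFT bin k of sequence x (naive, O(N) per bin)."""
    n_total = len(x)
    return abs(sum(v * cmath.exp(-2j * math.pi * k * n / n_total)
                   for n, v in enumerate(x)))

# Bin k corresponds to k / DURATION_S Hz, so 0.5-4 Hz spans bins 15-120.
low_bin = int(0.5 * DURATION_S)
high_bin = int(4 * DURATION_S)
peak_bin = max(range(low_bin, high_bin + 1),
               key=lambda k: dft_magnitude(signal, k))
peak_hz = peak_bin / DURATION_S
print(n_samples, nyquist_hz, peak_hz)
```

For real recordings an FFT-based estimate (for example Welch's method) would normally replace the naive DFT; it is used here only to keep the sketch dependency-free.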
STUDENT-PERFORMACE-ANALYSIS-SYSTEM
Student Performance Analysis System
We consider a real-time application: a data warehouse that operates within a time frame users perceive as immediate, so that top management can analyse students' academic performance in their institutions. Many academic institutions rely on an operational database for their day-to-day updates. Such a system fully satisfies the complex quality requirements of an OLTP system, but it shows significant shortcomings for OLAP: data are not adequately prepared for complex reporting, the operational database cannot provide a broad range of options for creating complex reports, and it has no special tools for creating user-defined queries.
A significant benefit of this approach to information and knowledge retrieval is that the user does not need knowledge of the relational model or of complex query languages. This approach to data analysis is becoming increasingly popular because it lets OLTP systems be optimised for their purpose while data analysis is transferred to OLAP systems. Information from an academic institution's system can be assessed rapidly to determine how students in that institution are performing. The data and information gained from the system can be used as a substantial indicator for monitoring potential failures and improvements. Furthermore:
➢ Alerts can be sent to parents and academic staff to inform them about a student's performance.
➢ Counselling can be given to students who struggle with their performance before they lose ground.
➢ Insights can be drawn for the future, so that students can benefit from them in placements.
Twitter-Sentiment-Analysis
Predicts the sentiment of live tweets related to user-entered keywords.
sherkhan15's Repositories
sherkhan15/distarctor-Detection_Eyes_Emotions_CI
sherkhan15/EDGAR-SEC-REPORTS-ANALYSIS
sherkhan15/K-Means-Project
sherkhan15/Random-Forest-Project-Lender-Score-
sherkhan15/Algorithms
sherkhan15/Face-Recognization
sherkhan15/Forest-Cover-Predictor-with-ANN-SIGMOID-VS-RELU-
sherkhan15/Predicting_house_prices_with_ames_dataset
This project aimed to estimate the sale price of properties based on their "fixed" characteristics, such as neighborhood, plot size, and number of stories. Second, I tried to estimate the value of possible changes and renovations to properties from the variation in sale price not explained by the fixed characteristics.
sherkhan15/python-docx
Create and modify Word documents with Python
sherkhan15/Shred---Build
Allows users to select fitness trainers and nutritionists according to their needs.
sherkhan15/Spam-Filter-Analysis
When I finished the theoretical part, I wanted to try implementing a practical, real-world example. I found it hard to begin since I didn't know where to start. One of the simplest projects to start with was building a spam filter. So we are going to start from the bottom with real email messages and classify them as spam or non-spam. The dataset that I'm going to use is a preprocessed subset of the Ling-Spam Dataset.
sherkhan15/Yelp-Business-Rating-Prediction
Classifies reviews into star ratings from 1 to 5 based on their text.
sherkhan15/Sleeping-EEG-Analysis
sherkhan15/Twitter-Sentiment-Analysis
sherkhan15/3D-Machine-Learning
A resource repository for 3D machine learning
sherkhan15/Azure-Data-Factory_-Use-cases
Working on ETL and ELT pipelines using Azure Data Factory for batch and streaming flows of structured and unstructured data.
sherkhan15/DataScienceTools
Useful data science and machine learning tools, libraries, and packages
sherkhan15/delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
sherkhan15/E-Commerce_Customers-with-simple-linear-regression
sherkhan15/E2ESynapseDemo
sherkhan15/Event-Producer-Azure-Event-App
Creating a C# app with Visual Studio to produce events and send them to Azure Event App.
sherkhan15/Event-Producer-to-Cosmos-Db---Event-Hub-through-Azure-Function
This project produces events to Azure Cosmos DB and then captures them in Azure Event Hub through an Azure Function.
sherkhan15/harvest_api_samples
Samples of Harvest API usage in various languages.
sherkhan15/lets-stop-wildfires-hackathon
Together we can make things happen!
sherkhan15/nn-from-scratch
Implementing a Neural Network from Scratch
sherkhan15/Predicting-Employee-Attrition-using-Machine-Learning
In this project, I created a model that predicts whether an employee will leave their job. The model depends on features such as age, monthly income, number of years worked, and gender. The data was sourced from Kaggle, and the type of machine learning used is supervised learning. ForestTreeClassifier was used for the machine learning model, and all code is in Python.
sherkhan15/PRML
PRML algorithms implemented in Python
sherkhan15/selling-partner-api-models
This repository contains OpenAPI models for developers to use when developing software to call Selling Partner APIs.
sherkhan15/TDD--EEG-SIgnals-
Target Destruction Descriptor- EEG
sherkhan15/tech-talks
This repository contains the notebooks and presentations we use for our Databricks Tech Talks