TejasSutar01

AIML Developer

Searching Pune

Pinned Repositories

AIRLINES_H_CLUSTERING
Perform both hierarchical and K-means clustering on the airlines data to obtain the optimum number of clusters. Draw inferences from the clusters obtained.
Language:Python1 2 00
AIRLINES_KMEANS_CLUSTERING
Perform both hierarchical and K-means clustering on the airlines data to obtain the optimum number of clusters. Draw inferences from the clusters obtained.
Language:Python1 3 01
Car-Price-Prediction
The objective of this analysis is to provide a reliable regression model that predicts the price of a car as accurately as possible from the variables provided. The idea is for the model to be used for any new cars added to the dataset going forward.
Language:Jupyter Notebook1 1 00
CRIME_CLUSTERING
Perform clustering on the crime data, identify the number of clusters formed, and draw inferences.
Language:Python1 2 00
DECISION_TREE_COMPANY
About the data: Consider a Company dataset with around 10 variables and 400 records. The attributes are as follows:
Sales -- Unit sales (in thousands) at each location
Competitor Price -- Price charged by competitor at each location
Income -- Community income level (in thousands of dollars)
Advertising -- Local advertising budget for company at each location (in thousands of dollars)
Population -- Population size in region (in thousands)
Price -- Price company charges for car seats at each site
Shelf Location at stores -- A factor with levels Bad, Good and Medium indicating the quality of the shelving location for the car seats at each site
Age -- Average age of the local population
Education -- Education level at each location
Urban -- A factor with levels No and Yes to indicate whether the store is in an urban or rural location
US -- A factor with levels No and Yes to indicate whether the store is in the US or not
Problem Statement: A cloth manufacturing company wants to know which segments or attributes lead to high sales.
Approach: A decision tree can be built with Sales as the target variable (first converted into a categorical variable) and all other variables treated as independent variables.
Language:Python1 2 01
DECISION_TREE_FRAUD_DATA
Use decision trees to prepare a model on fraud data, treating those who have taxable_income <= 30000 as "Risky" and all others as "Good".
Language:Python1 2 00
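The labelling rule in the description above can be sketched as follows. This is a minimal illustration rather than the repository's code; the function name `label_risk` and the sample incomes are hypothetical.

```python
def label_risk(taxable_income, threshold=30000):
    """Return "Risky" when taxable income is at or below the threshold, else "Good"."""
    return "Risky" if taxable_income <= threshold else "Good"


# Hypothetical sample values, for illustration only.
incomes = [25000, 30000, 30001, 68000]
labels = [label_risk(x) for x in incomes]
print(labels)
```

The resulting labels would then serve as the categorical target for the decision tree.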
Forecast-for-PM2.5
Forecast for PM2.5
Language:Jupyter Notebook1 2 00
FORECASTING_AIRLINES_DATA
Forecast the Airlines Passengers data set. Prepare a document for each model explaining how many dummy variables were created and the RMSE value for each model. Finally, state which model will be used for forecasting.
Language:Python1 2 00
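The two quantities the description asks to report, dummy variables and RMSE, can be sketched as below. This is only an illustration under stated assumptions: the helper names `month_dummies` and `rmse` are hypothetical, and using 11 monthly dummies with December as the baseline is a common convention, not necessarily what the repository does.

```python
import math


def month_dummies(month_index):
    """One-hot encode a month (1-12) into 11 dummy variables.

    December (12) is the baseline and maps to all zeros (an assumed convention).
    """
    return [1 if month_index == m else 0 for m in range(1, 12)]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Computing `rmse` on the same held-out period for every candidate model gives a like-for-like basis for choosing the final one.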
FORECASTING_COCACOLA_SALES
Forecast the CocaCola prices data set. Prepare a document for each model explaining how many dummy variables were created and the RMSE value for each model. Finally, state which model will be used for forecasting.
Language:Python1 2 00
Sentiment-Analysis
Produce a daily analysis of a product, extracting sentiments, emotions, etc. from Amazon data, and correlate it with the NSE or BSE stock market over the past 3 months.
Language:Python2 2 00

TejasSutar01's Repositories

TejasSutar01/Sentiment-Analysis
Produce a daily analysis of a product, extracting sentiments, emotions, etc. from Amazon data, and correlate it with the NSE or BSE stock market over the past 3 months.
Language:Python2 2 00
TejasSutar01/Car-Price-Prediction
The objective of this analysis is to provide a reliable regression model that predicts the price of a car as accurately as possible from the variables provided. The idea is for the model to be used for any new cars added to the dataset going forward.
Language:Jupyter Notebook1 1 00
TejasSutar01/DECISION_TREE_COMPANY
About the data: Consider a Company dataset with around 10 variables and 400 records. The attributes are as follows:
Sales -- Unit sales (in thousands) at each location
Competitor Price -- Price charged by competitor at each location
Income -- Community income level (in thousands of dollars)
Advertising -- Local advertising budget for company at each location (in thousands of dollars)
Population -- Population size in region (in thousands)
Price -- Price company charges for car seats at each site
Shelf Location at stores -- A factor with levels Bad, Good and Medium indicating the quality of the shelving location for the car seats at each site
Age -- Average age of the local population
Education -- Education level at each location
Urban -- A factor with levels No and Yes to indicate whether the store is in an urban or rural location
US -- A factor with levels No and Yes to indicate whether the store is in the US or not
Problem Statement: A cloth manufacturing company wants to know which segments or attributes lead to high sales.
Approach: A decision tree can be built with Sales as the target variable (first converted into a categorical variable) and all other variables treated as independent variables.
Language:Python1 2 01
TejasSutar01/DECISION_TREE_FRAUD_DATA
Use decision trees to prepare a model on fraud data, treating those who have taxable_income <= 30000 as "Risky" and all others as "Good".
Language:Python1 2 00
TejasSutar01/Forecast-for-PM2.5
Forecast for PM2.5
Language:Jupyter Notebook1 2 00
TejasSutar01/FORECASTING_AIRLINES_DATA
Forecast the Airlines Passengers data set. Prepare a document for each model explaining how many dummy variables were created and the RMSE value for each model. Finally, state which model will be used for forecasting.
Language:Python1 2 00
TejasSutar01/FORECASTING_COCACOLA_SALES
Forecast the CocaCola prices data set. Prepare a document for each model explaining how many dummy variables were created and the RMSE value for each model. Finally, state which model will be used for forecasting.
Language:Python1 2 00
TejasSutar01/FORECASTING_COCASALES_DATADRIVEN_MODEL
Forecast the CocaCola prices data set. Prepare a document for each model explaining how many dummy variables were created and the RMSE value for each model. Finally, state which model will be used for forecasting.
1 2 0
TejasSutar01/FORECASTING_PLASTIC_SALES_DATA
Forecast the Plastic sales data set. Prepare a document for each model explaining how many dummy variables were created and the RMSE value for each model. Finally, state which model will be used for forecasting.
Language:Python1 2 0
TejasSutar01/KNN_GLASS_DATA
Prepare a model for glass classification using KNN.
Data Description:
RI: Refractive index
Na: Sodium (unit measurement: weight percent in corresponding oxide, as are attributes 4-10)
Mg: Magnesium
Al: Aluminum
Si: Silicon
K: Potassium
Ca: Calcium
Ba: Barium
Fe: Iron
Type: Type of glass (class attribute):
1 -- building_windows_float_processed
2 -- building_windows_non_float_processed
3 -- vehicle_windows_float_processed
4 -- vehicle_windows_non_float_processed (none in this database)
5 -- containers
6 -- tableware
7 -- headlamps
Language:Python1 2 01
TejasSutar01/KNN_IRIS_DATA
Implement a KNN model to classify the species into categories.
Language:Python1 2 0
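As a rough sketch of the KNN idea used in this and the neighbouring repositories, the snippet below classifies a point by majority vote among its k nearest training points. It is not the repository's implementation; `knn_predict` and the toy points are hypothetical, and Euclidean distance is an assumed choice of metric.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy two-feature data, for illustration only.
train_X = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9)]
train_y = ["a", "a", "b", "b"]
prediction = knn_predict(train_X, train_y, (1.1, 1.0), k=3)
print(prediction)
```

In practice the features are usually scaled first, since distance-based voting is sensitive to differing units.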
TejasSutar01/KNN_ZOO_DATA
Implement a KNN model to classify the animals into categories.
Language:Python1 2 0
TejasSutar01/LOGISTIC_REGRESSION_AFFAIRS_DATA
I have a dataset containing family information on married couples, with around 10 variables and 600+ observations. The independent variables include gender, age, years married, children, religion, etc. The response variable is the number of extramarital affairs. The goal is to find out which factors influence the chance of an extramarital affair. Treating the response as a binary variable (a person either has had an affair or has not), a logistic regression model can be fitted to predict the probability of an extramarital affair.
install.packages('AER')
data(Affairs,package="AER")
Language:Python1 2 0
TejasSutar01/LOGISTIC_REGRESSION_BANK_DATA
Output variable -> y
y -> Whether the client has subscribed to a term deposit or not. Binomial ("yes" or "no")
Language:Python1 2 0
TejasSutar01/Multiple-Linear-Regression_50_Startups
Prepare a prediction model for the profit in the 50_startups data. Apply transformations to improve the predictions of profit, and make a table containing the R^2 value for each prepared model.
Language:Python1 2 0
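A table of R^2 values per model, as the description requests, could be assembled along these lines. This is a sketch only: `r_squared`, the model names, and all numbers are hypothetical placeholders, not results from the repository.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical values for illustration only; not results from the repository.
actual = [10.0, 12.0, 14.0, 16.0]
predictions = {
    "model_a": [10.5, 11.5, 14.5, 15.5],
    "model_b": [13.0, 13.0, 13.0, 13.0],  # always predicts the mean
}

# One R^2 entry per prepared model.
r2_table = {name: r_squared(actual, pred) for name, pred in predictions.items()}
for name, score in r2_table.items():
    print(f"{name}\t{score:.3f}")
```

A model that always predicts the mean scores 0, which gives a useful floor when comparing transformed models.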
TejasSutar01/Multiple-Linear-Regression_ToyotaCorolla
Consider only the columns below and prepare a prediction model for predicting Price.
Corolla<-Corolla[c("Price","Age_08_04","KM","HP","cc","Doors","Gears","Quarterly_Tax","Weight")]
Language:Python1 2 0
TejasSutar01/NB_SALARY_DATA
Prepare a classification model using Naive Bayes for salary data.
Language:Python1 2 0
TejasSutar01/NB_SMS_DATA
Build a Naive Bayes model on the data set to classify messages as ham or spam.
Language:Python1 2 0
TejasSutar01/RANDOM_FOREST_COMPANY_DATA
A cloth manufacturing company wants to know which segments or attributes lead to high sales. Approach: A Random Forest can be built with Sales as the target variable (first converted into a categorical variable) and all other variables treated as independent variables.
Language:Python1 2 0
TejasSutar01/RANDOM_FOREST_FRAUD_DATA
Use Random Forest to prepare a model on fraud data, treating those who have taxable_income <= 30000 as "Risky" and all others as "Good".
Language:Python1 2 0
TejasSutar01/Automatic_Candidates_CV_Recommendation
This project fetches candidates' CVs already present in the database and recommends suitable candidates based on their skills.
Language:Python
TejasSutar01/Chronic_Diseases_Prediction
The data was taken over a 2-month period in India, with 25 features (e.g., red blood cell count, white blood cell count, etc.). The target is the 'classification', which is either 'ckd' or 'notckd' (ckd = chronic kidney disease). There are 400 rows.
Language:Jupyter Notebook2 0
TejasSutar01/Classifier-Model
Worked on the Iris data set to predict the class of any new data fed to the model. The Decision Tree model built gave good accuracy, around 91%. The Decision Tree was built with both "Entropy" and "Gini Index" as split criteria.
Language:Jupyter Notebook
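The two split criteria named above measure how mixed a set of class labels is. The sketch below shows the standard definitions; the function names and the sample labels are hypothetical and not taken from the repository.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())


# A perfectly mixed two-class node versus a pure node (illustrative labels).
mixed = ["setosa", "setosa", "versicolor", "versicolor"]
pure = ["setosa", "setosa", "setosa"]
print(entropy(mixed), gini(mixed))
print(entropy(pure), gini(pure))
```

Both measures are zero for a pure node and largest for an evenly mixed one; a decision tree picks the split that reduces the chosen measure the most.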
TejasSutar01/GRIP_The-Sparks-Foundation_Data-Science-Internship
Tasks received for The Sparks Foundation internship.
Language:Jupyter Notebook2 0
TejasSutar01/Health-Insurance---JOB-A-THON---Analytics-Vidhya
Your client FinMan is a financial services company that provides various financial services, such as loans, investment funds, and insurance, to its customers. FinMan wishes to cross-sell health insurance to existing customers who may or may not hold insurance policies with the company. The company recommends health insurance to its customers based on their profile once these customers land on the website. Customers might browse the recommended health insurance policy and consequently fill in a form to apply. When these customers fill in the form, their Response towards the policy is considered positive and they are classified as a lead. Once these leads are acquired, the sales advisors approach them to convert, and thus the company can sell the proposed health insurance to these leads more efficiently. Now the company needs your help in building a model to predict whether a person will be interested in their proposed Health plan/policy, given the information about:
Demographics (city, age, region, etc.)
Information regarding holding policies of the customer
Recommended Policy Information
Language:Jupyter Notebook2 0
TejasSutar01/Keras-Tuner
TejasSutar01/Marksheet_OCR
Extracts entities from marksheets using generative AI.
TejasSutar01/Pima-Indians-Diabetes-Database
Context
This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.
Content
The dataset consists of several medical predictor variables and one target variable, Outcome. Predictor variables include the number of pregnancies the patient has had, their BMI, insulin level, age, and so on.
Acknowledgements
Smith, J.W., Everhart, J.E., Dickson, W.C., Knowler, W.C., & Johannes, R.S. (1988). Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proceedings of the Symposium on Computer Applications and Medical Care (pp. 261--265). IEEE Computer Society Press.
Inspiration
Can you build a machine learning model to accurately predict whether or not the patients in the dataset have diabetes?
Language:Jupyter Notebook2 0
TejasSutar01/test
Language:HTML1 0
TejasSutar01/Twitter-Tweet-Analysis
Twitter has become an important communication channel in times of emergency. The ubiquity of smartphones enables people to announce an emergency they’re observing in real time. Because of this, more agencies are interested in programmatically monitoring Twitter (e.g., disaster relief organizations and news agencies). But it’s not always clear whether a person’s words are actually announcing a disaster.
Language:Jupyter Notebook1 0