data-exploration-and-preprocessing
There are 21 repositories under the data-exploration-and-preprocessing topic.
AI-Northstar-Tech/vector-io
The only vector tooling you'll need. Star the repo and look out for an email to try out a brand new Vector Data Exploration demo! Use the universal VDF format for vector datasets to easily export and import data from all vector databases, and re-embed it using any model.
SayamAlt/Company-Bankruptcy-Prediction
Successfully developed a machine learning model that can accurately predict whether a firm will go bankrupt, based on features such as net value growth rate, borrowing dependency, cash/total assets, etc.
nafisalawalidris/Employee-Attrition-Control
The Employee Attrition Control project uses data analysis and predictive modeling to understand and address employee turnover. It provides insights and recommendations to reduce attrition and improve employee satisfaction and retention.
andysontran/health-factors-ml-pred
CHL5230 - Applied Machine Learning for Health Data (Fall 2023) @ University of Toronto: Datathon #1 - Lifestyle and Health Factors
AngelX62/Data-Science-Job-Clean
Data was downloaded from Kaggle.
andysontran/icu-mortality-ml-pred
CHL5230 - Applied Machine Learning for Health Data (Fall 2023) @ University of Toronto: Datathon #4 - ICU Mortality Prediction Using ML
andysontran/mhealth-wearable-ml-pred
CHL5230 - Applied Machine Learning for Health Data (Fall 2023) @ University of Toronto: Datathon #5 - mHealth and ML
andysontran/stroke-ml-pred
CHL5230 - Applied Machine Learning for Health Data (Fall 2023) @ University of Toronto: Datathon #2 - Early Prediction of Heart Failure
Aniket2021448/Movie-recommender-system
A machine learning project implemented from scratch, involving web scraping, data engineering, exploratory data analysis, NLP processing, and ML, that delivers a content-based movie recommender system.
saikaryekar/PySpark-Plane-Dataset-Exploration
Explored a dataset of planes while learning PySpark commands.
venkat-a/Happiness-Prediction
Prediction of happy customers based on happiness survey data.
SaiSurajMatta/Covid-19-Data-Exploration-Project
An SQL-based exploration of COVID-19 data and vaccination progress using the Covid-Deaths dataset for insights into global pandemic trends.
SayamAlt/Bank-Customer-Churn-Prediction-using-PySpark
Successfully established a machine learning model using PySpark that can classify whether a bank customer will churn, with an accuracy of more than 86% on the test set.
SayamAlt/Credit-Card-Approval-Prediction
Successfully developed a machine learning model that can predict, with up to 100% accuracy, whether a given applicant's credit card application would be approved, based on several demographic features such as applicant age, total income, marital status, total years of work experience, etc.
SayamAlt/Employee-Attrition-Prediction
Successfully established a machine learning model that can accurately predict whether an employee of a given company will leave it in the near future, based on several employee details and employment metrics.
SayamAlt/Financial-News-Sentiment-Analysis
Successfully developed a fine-tuned DistilBERT transformer model that can predict the overall sentiment of a piece of financial news with an accuracy of nearly 81.5%.
SayamAlt/Global-News-Headlines-Text-Summarization
Successfully established a text summarization model using Seq2Seq modeling with Luong attention, which can give a short, concise summary of global news headlines.
SayamAlt/Symptoms-Disease-Text-Classification
Successfully developed a fine-tuned BERT transformer model that can classify symptoms to their corresponding diseases with an accuracy of up to 89%.
SayamAlt/Taxi-Trip-Fare-Prediction
Successfully created a machine learning model that can accurately predict the fare of a taxi trip based on several features such as trip duration, tip amount, etc.
srosalino/Prediction_of_Seoul_Bikes_Demand
The objective of this project is to predict the number of bicycles that need to be made available each hour in order to make the service as efficient as possible.