datapreparation
There are 63 repositories under the datapreparation topic.
sfu-db/dataprep
Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.
ydataai/ydata-talkdatatome
Make your dataset talk to you. The AI assistant for data preparation.
CoDS-GCS/KGFarm
A Holistic Platform for Automating Data Preparation
baharzurnaci/Machine-Learning-Zoomcamp-
This repo includes code for ML Zoomcamp. You can follow the tutorials at the link here: https://www.youtube.com/watch?v=rowoDjPc8HU&list=PL3MmuxUbc_hIhxl5Ji8t4O6lPAOpHaCLR 👩🏼💻
victorcouste/trifacta-flows-examples
Trifacta Flows Examples and Templates. Flows zip files, recipes and datasets.
visokio/omniscope-custom-blocks
Public repository for custom blocks for Omniscope
Ashleshk/Tableau-10-A-Z-Hands-on-Tableau-Training-for-Data-Science-Udemy
Learn data visualization through Tableau 2020 and create opportunities for you or key decision-makers to discover data patterns such as customer purchase behavior, sales trends, or production bottlenecks. This course is on Udemy.
ms8909/dptron
mltrons dptron: Dirty Data in, Clean Data Out!
huseyincenik/data_science
Data Science materials
RafeyIqbalRahman/Data-Imputation-Techniques
This repository demonstrates data imputation using Scikit-Learn's SimpleImputer, KNNImputer, and IterativeImputer.
rrambhia22/Bike_Crash_Analysis
This project determines and predicts the type of accidents taking place in the city of Austin. The data helps in understanding which factors lead to accidents, based on the severity of each incident.
wsperger/dataprepping_generative_ai
A one-stop shop for all tools to prepare datasets for generative AI
DaveChui/Data-Preparation-and-Cleaning---Geo-Data
Preparing and Cleaning Data
deepu9962/Exploratory-Analysis-of-Geolocational-Data
This project involves the use of K-Means Clustering to find the best accommodation for students in Bangalore (or any other city of your choice) by classifying accommodation for incoming students on the basis of their preferences on amenities, budget and proximity to the location.
ENGRZULQARNAIN/ScrapySub
ScrapySub is a Python library designed to recursively scrape website content, including subpages. It fetches the visible text from web pages and stores it in a structured format for easy access and analysis. This library is particularly useful for NLP and AI developers who need to gather large amounts of web content for their projects.
imuhammadaasim/bike_sales_data_analysis
The Bikes Sales Analysis Excel Project is a practical exploration of sales data analysis using Microsoft Excel. This project showcases how Excel can be a powerful tool for data cleaning, preprocessing, visualization, and dashboard creation, all within a familiar spreadsheet environment.
MadhuBala11/DiabetesPrediction
In this project, I have used logistic regression, a supervised machine learning algorithm, to predict whether a person has diabetes or not based on various features such as age, blood pressure, glucose level, body mass index, etc. I have used Python and popular libraries such as Pandas, Scikit-Learn, and Matplotlib to perform model building.
mahmudie/GDP_Analysis
India GDP Analysis using Python
NAVEENDATAANALYST/CUSTOMER-ANALYTICS-ON-USA-BASED-COMPANY-DATA
This is my 6th semester Essentials of Data Analytics project.
NAVEENDATAANALYST/HOTEL-RESERVATIONS-PREDICTION-IN-R
Can you correctly predict whether a customer will cancel the reservation? You can find the dataset on Kaggle: https://www.kaggle.com/datasets/ahsan81/hotel-reservations-classification-dataset
NAVEENDATAANALYST/SPACESHIP-TITANIC-PASSENGER-TRANSPORT-PREDICTION
The data is available from the Kaggle competition: https://www.kaggle.com/competitions/spaceship-titanic I participated in and completed the competition on my own.
prakhargurawa/Titanic-Survival-Predictor
Trying to predict the survival rate of passengers using algorithms like Logistic Regression, AdaBoost, Gradient Boosting, Decision Tree Classifiers, Extra Trees Classifiers, Random Forest Classifiers, and XGBoost with appropriate data preprocessing techniques.
rainaa0277/House-Price-Prediction-using-Linear-Regression
Building a house price prediction model for a real estate firm, based on various factors. Problem: Regression | Algorithm used: Linear Regression using OLS
rrambhia22/Crimes_Incarceration_Analysis
Crime and Incarceration in the United States contains data on crimes committed and prisoner counts in all 50 states, which is analyzed using various analytical methods.
Zeina-Y/Bike-Sales-Dashboard
A Google Sheets-based dashboard for analyzing bike sales data, customer behavior, and demographic trends through interactive visualizations and data filtering.
AnjaliKumari021/Retail_Customer_Behavior_Analysis_using_SQL
Analyzed retail data to understand customer behavior and transaction patterns using SQL
Hammad112/CodeX-Energy-drink-Project
Excel data cleaning, data manipulation, and data visualization
Hammad112/Hotel-Reservation-Data-Exploration
Welcome to the HotelReservationInsights repository! This project focuses on the Exploratory Data Analysis (EDA) of a Hotel Reservation Dataset and prepares the data for Machine Learning (ML) and Deep Learning algorithms.
SwethaJoseph/Credit-Risk-Assessment-EDA-Case-Study
Conducted an Exploratory Data Analysis (EDA) using Python to assess credit risk, identifying key factors that contribute to loan defaults and improving lending decisions
SwethaJoseph/Statistical-Stock-Performance-Analysis
Conducted a statistical analysis of Microsoft, Tesla, and Apple stock performance compared to the S&P 500, examining price trends, volatility, and correlations to derive investment insights.
SwethaJoseph/Superstore-Sales-Overview-Tableau-Project
An interactive Tableau dashboard analyzing Superstore's sales and profitability trends across regions, products, and customer segments to uncover actionable key business insights and growth opportunities.
Armeldt/Python-WineShop-Sales-Analysis
Data reconciliation and sales analysis for a prestigious wine retailer using Python to assess product performance and pricing accuracy
mehadihn/Data-Preparation-Techniques-Project
This project was completed for the data preparation techniques course.
Rishikesh-Jadhav/VertexAI-Loan-Risk-Prediction
This project uses Google Cloud's Vertex AI to develop a machine learning model that predicts loan repayment risks using a tabular dataset. It encompasses data preparation, model training, evaluation, deployment, and prediction processes.
VaishakhMenon/Data-Analysis
Data analysis learning