ellahu1003
Hello! My name is Ella. I am based in London and have just completed my Data Science Bootcamp at Imperial College London.
London, United Kingdom
Pinned Repositories
Ames-Housing-Price-Predictions-Project
This project focuses on predicting residential property prices in Ames, Iowa using multiple linear regression models. The workflow includes data cleaning, exploratory data analysis (EDA), and model evaluation to optimise real estate sales strategies. The analysis highlights the relationships between property features.
Data-Preprocessing-Cleaning
Data-Visualisation
Decesion-Trees-Titanic-Project
This project involves building a Decision Tree classifier to predict Titanic passenger survival. The analysis includes training models with varying depth restrictions, visualising the decision trees, and evaluating their accuracy on different data sets.
ellahu1003
GitHub profile README
Kmeans-Country-Data-Project
This project uses K-means clustering to group countries based on socio-economic and health factors such as child mortality, exports, health spending, income, and GDP per capita. The workflow includes data preprocessing, clustering, and performance evaluation using silhouette scores.
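To make the clustering and evaluation steps above concrete, here is a minimal sketch in plain Python. It is not the repository's code: the function names, the deterministic "first k points" initialisation, and the two-feature toy points are all assumptions made for illustration, and a real analysis would more likely rely on a library implementation.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; centroids start at the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (needs at least two clusters)."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        def mean_dist(cluster):
            # Mean distance from p to the other points in the given cluster.
            others = [q for j, q in enumerate(points) if labels[j] == cluster and j != i]
            return sum(math.dist(p, q) for q in others) / len(others) if others else None
        a = mean_dist(labels[i])  # cohesion: distance to own cluster
        if a is None:             # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        b = min(mean_dist(c) for c in clusters if c != labels[i])  # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical, already-scaled feature pairs for six "countries".
toy_points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (9.0, 10.0), (2.0, 1.0), (10.0, 9.0)]
toy_labels, toy_centroids = kmeans(toy_points, k=2)
toy_score = silhouette_score(toy_points, toy_labels)
```

A score close to 1 indicates compact, well-separated clusters, so comparing the score across several values of k is one common way to choose the number of clusters.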
Sentiment-Analysis-Project
This project focuses on sentiment analysis of the NLTK movie reviews dataset using machine learning. I implemented and compared Multinomial Naive Bayes and Logistic Regression models after preprocessing text data with TF-IDF vectorisation. The goal was to classify reviews as positive or negative.
SQL-Project-Electronics-and-Appliances-Business
This project involves creating and managing a database for an electronics and appliances business using SQL. It includes the creation of three key tables: "Customers", "Orders", and "Products", followed by data insertion and the execution of SQL queries for data retrieval, price adjustments, revenue analysis, and product management.
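As a rough illustration of that workflow, the sketch below builds the three tables in an in-memory SQLite database through Python's standard `sqlite3` module, then runs a price adjustment and a revenue query. The column names, sample rows, and helper functions are assumptions made for this example; the repository's actual schema and SQL dialect may differ.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with hypothetical Customers, Products, and Orders tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE Products (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL NOT NULL);
        CREATE TABLE Orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES Customers(customer_id),
            product_id  INTEGER NOT NULL REFERENCES Products(product_id),
            quantity    INTEGER NOT NULL
        );
    """)
    # Invented sample rows, used only to exercise the queries below.
    conn.executemany("INSERT INTO Customers VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])
    conn.executemany("INSERT INTO Products VALUES (?, ?, ?)", [(1, "Kettle", 20.0), (2, "Toaster", 30.0)])
    conn.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?)", [(1, 1, 1, 2), (2, 2, 2, 1), (3, 1, 2, 1)])
    conn.commit()
    return conn

def adjust_price(conn, product_id, factor):
    """Price adjustment: multiply one product's price by the given factor."""
    conn.execute("UPDATE Products SET price = price * ? WHERE product_id = ?", (factor, product_id))
    conn.commit()

def revenue_by_product(conn):
    """Revenue analysis: quantity times current price, summed per product."""
    rows = conn.execute("""
        SELECT p.name, SUM(o.quantity * p.price) AS revenue
        FROM Orders AS o
        JOIN Products AS p ON p.product_id = o.product_id
        GROUP BY p.name
        ORDER BY p.name
    """).fetchall()
    return dict(rows)
```

Parameterised queries (the `?` placeholders) keep values out of the SQL string, which is a sensible habit even in a small teaching database.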
Tableau-Inc-5000-Analysis
This project uses Tableau to analyse the 2014 Inc. 5000 dataset, focusing on company growth, industry trends, geographical insights, and workforce impacts. The analysis includes visualisations such as bar graphs and heatmaps to convey key insights.
Text-Classification-Project
This project focuses on text classification using the 20 Newsgroups dataset, which contains approximately 20,000 documents across 20 different categories. A Multinomial Naive Bayes model was implemented to classify the documents, following a structured workflow that included data cleaning, TF-IDF vectorisation, and model evaluation.
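The TF-IDF and Multinomial Naive Bayes steps mentioned above can be sketched in a few lines of plain Python. This is a toy version under stated assumptions, not the project's implementation: the `TfidfNaiveBayes` class, the whitespace tokeniser, the smoothed IDF formula, and the four example sentences are all hypothetical, and the real project works on the 20 Newsgroups documents instead.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Very rough tokeniser: lowercase and split on whitespace."""
    return text.lower().split()

class TfidfNaiveBayes:
    """Toy TF-IDF + Multinomial Naive Bayes classifier with add-one smoothing."""

    def fit(self, docs, labels):
        tokenized = [tokenize(d) for d in docs]
        n_docs = len(tokenized)
        doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
        # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1
        self.idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
        self.classes = sorted(set(labels))
        class_counts = Counter(labels)
        self.log_prior = {c: math.log(class_counts[c] / n_docs) for c in self.classes}
        # Sum the TF-IDF weight of each term within each class.
        weights = {c: defaultdict(float) for c in self.classes}
        for tokens, label in zip(tokenized, labels):
            for term, value in self._tfidf(tokens).items():
                weights[label][term] += value
        vocab_size = len(self.idf)
        self.log_likelihood = {}
        for c in self.classes:
            total = sum(weights[c].values()) + vocab_size  # add-one smoothing
            self.log_likelihood[c] = {t: math.log((weights[c][t] + 1) / total) for t in self.idf}
        return self

    def _tfidf(self, tokens):
        # Terms never seen during fit are ignored.
        counts = Counter(t for t in tokens if t in self.idf)
        return {t: n * self.idf[t] for t, n in counts.items()}

    def predict(self, doc):
        features = self._tfidf(tokenize(doc))
        def score(c):
            return self.log_prior[c] + sum(v * self.log_likelihood[c][t] for t, v in features.items())
        return max(self.classes, key=score)

# Invented training sentences standing in for newsgroup posts.
toy_docs = [
    "the rocket reached orbit",
    "nasa launched a new rocket",
    "the team won the hockey game",
    "hockey players scored a goal",
]
toy_labels = ["space", "space", "sport", "sport"]
toy_model = TfidfNaiveBayes().fit(toy_docs, toy_labels)
```

Calling `toy_model.predict("hockey goal")` picks the class whose training documents gave those terms the most TF-IDF weight.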
ellahu1003's Repositories
ellahu1003/Ames-Housing-Price-Predictions-Project
ellahu1003/Data-Preprocessing-Cleaning
ellahu1003/Data-Visualisation
ellahu1003/Decesion-Trees-Titanic-Project
ellahu1003/ellahu1003
ellahu1003/Kmeans-Country-Data-Project
ellahu1003/Sentiment-Analysis-Project
ellahu1003/SQL-Project-Electronics-and-Appliances-Business
ellahu1003/Tableau-Inc-5000-Analysis
ellahu1003/Text-Classification-Project
ellahu1003/Iris-Logistic-Regression-Project
This project applies logistic regression to the Iris dataset to classify iris species based on sepal and petal measurements. The workflow includes data normalisation, model training, and performance evaluation using a confusion matrix and standard classification metrics.
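For readers unfamiliar with the evaluation step, the following sketch shows how a confusion matrix and per-class precision and recall can be computed from true and predicted labels. The helper names and the six label pairs are made up for illustration; they are not results from the project.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in classes] for t in classes]

def precision_recall(matrix, classes):
    """Return {class: (precision, recall)} derived from a confusion matrix."""
    report = {}
    for i, c in enumerate(classes):
        true_positives = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column total
        actual = sum(matrix[i])                    # row total
        precision = true_positives / predicted if predicted else 0.0
        recall = true_positives / actual if actual else 0.0
        report[c] = (precision, recall)
    return report

# Hypothetical labels: one versicolor sample is misclassified as virginica.
species = ["setosa", "versicolor", "virginica"]
y_true = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
y_pred = ["setosa", "setosa", "versicolor", "virginica", "virginica", "virginica"]
toy_matrix = confusion_matrix(y_true, y_pred, species)
toy_report = precision_recall(toy_matrix, species)
```

In this toy case the single error lowers recall for versicolor and precision for virginica, which is the kind of asymmetry a confusion matrix makes visible and overall accuracy hides.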
ellahu1003/Lists-and-Dictionaries
ellahu1003/SQL
ellahu1003/Task-Manager-Project
This project is a command-line application in Python designed to manage user authentication, task tracking, and administrative functions. It enables members of a small business to efficiently view, manage, and assign tasks.