alecngai
Aspiring Data Scientist, with a strong physics, math, and logic background.
NSUS group inc. Toronto, Ontario
Pinned Repositories
01-Kickstarter-analysis
Analyzing Kickstarter fundraising data (Kickstarter_Challenge.xlsx) to provide clarity on two main issues: outcomes based on launch date and outcomes based on goals.
02-Stock_Analysis
03-Election_Analysis
04-School_District_Analysis
05_Pybar_Analysis
07_Pwelett_Hackard_Analysis
We will complete two technical analyses using SQL and pgAdmin 4. Using the data provided to us, we will produce two results: the number of retiring employees by title, and the employees eligible for the mentorship program.
08_Movies_ETL
Tasked with creating datasets for a hackathon. The client, "Amazing Prime", wants a dataset that is always up to date, produced by an automated process that extracts, transforms, and loads the source data into a clean, workable dataset. We do this in four tasks: write an ETL function to read three data files, extract and transform the Wikipedia data, extract and transform the Kaggle and rating data, and load the data into a PostgreSQL movie database.
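The extract, transform, and load steps described above can be sketched roughly as follows. This is only a minimal illustration, not the repository's code: the inline records, the field names (`imdb_id`, `title`, `year`, `budget`), and the `movies` table are all hypothetical, the ratings file is omitted, and an in-memory SQLite database stands in for PostgreSQL so the sketch runs without a server.

```python
import csv
import io
import json
import sqlite3

# Hypothetical in-memory stand-ins for the Wikipedia JSON and Kaggle CSV files.
WIKI_JSON = '[{"title": "Movie A", "imdb_id": "tt001", "year": "1999"}, {"title": "No ID"}]'
KAGGLE_CSV = "imdb_id,budget\ntt001,1000000\n"


def extract(wiki_text, kaggle_text):
    """Parse the raw text of both sources into lists of dicts."""
    wiki = json.loads(wiki_text)
    kaggle = list(csv.DictReader(io.StringIO(kaggle_text)))
    return wiki, kaggle


def transform(wiki, kaggle):
    """Drop Wikipedia rows without an ID, cast types, and join on imdb_id."""
    budgets = {row["imdb_id"]: int(row["budget"]) for row in kaggle}
    rows = []
    for movie in wiki:
        imdb_id = movie.get("imdb_id")
        if imdb_id is None:
            continue  # cannot be joined to the Kaggle data
        rows.append((imdb_id, movie["title"], int(movie["year"]), budgets.get(imdb_id)))
    return rows


def load(rows, conn):
    """Write the cleaned rows to the database (SQLite here, as an assumption)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS movies "
        "(imdb_id TEXT PRIMARY KEY, title TEXT, year INTEGER, budget INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO movies VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(*extract(WIKI_JSON, KAGGLE_CSV)), conn)
loaded = conn.execute("SELECT imdb_id, title, year, budget FROM movies").fetchall()
```

Using `INSERT OR REPLACE` keyed on the ID means the pipeline can be rerun on refreshed files without creating duplicate rows, which is one simple way to keep a dataset "always up to date".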
14_Bikesharing
Using Tableau, we analyze the city's bike-sharing platform from several angles.
17_Credit_Risk_Analysis
Credit risk is very tough to predict. In this project we look at how the factors in our loan_stats csv help predict whether someone is low or high risk. One approach data scientists use for this type of problem is to build models, then train and evaluate them. In this project we use the imbalanced-learn and scikit-learn libraries to build models and evaluate them using resampling. In the first few models I oversampled the data using the RandomOverSampler and SMOTE algorithms and undersampled the data with the ClusterCentroids algorithm. In the remaining models I used a combined over- and undersampling approach with SMOTEENN. Finally, I compared two machine learning models that reduce bias, BalancedRandomForestClassifier and EasyEnsembleClassifier.
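To make the idea of random oversampling concrete, here is a minimal standard-library sketch. It is a hand-rolled stand-in for the technique, not imbalanced-learn's implementation and not the project's code; the function name `random_oversample`, the toy data, and the class labels are hypothetical.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, n in counts.items():
        rows = [x for x, y in zip(features, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(rows))  # sample with replacement
            out_y.append(cls)
    return out_x, out_y


# Hypothetical toy data: 8 low-risk loans and 2 high-risk loans.
X = [[i] for i in range(10)]
y = ["low_risk"] * 8 + ["high_risk"] * 2

X_res, y_res = random_oversample(X, y)
balanced_counts = Counter(y_res)
```

After resampling, both classes have eight rows. In practice the resampling would be applied to the training split only, so that the test set keeps the original class balance.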
Predicting_Canadian_Job_Vacancies
Understand the Canadian labour market and predict what the job market will look like in the near future. It is important to know how the labour market has changed over the past years and what it will look like in the upcoming years. Understanding the labour market and looking at potential vacancies could go a long way toward avoiding situations such as economic collapse and could also assist with education planning.
alecngai's Repositories
alecngai/Predicting_Canadian_Job_Vacancies
Understand the Canadian labour market and predict what the job market will look like in the near future. It is important to know how the labour market has changed over the past years and what it will look like in the upcoming years. Understanding the labour market and looking at potential vacancies could go a long way toward avoiding situations such as economic collapse and could also assist with education planning.
alecngai/01-Kickstarter-analysis
Analyzing Kickstarter fundraising data (Kickstarter_Challenge.xlsx) to provide clarity on two main issues: outcomes based on launch date and outcomes based on goals.
alecngai/02-Stock_Analysis
alecngai/03-Election_Analysis
alecngai/04-School_District_Analysis
alecngai/05_Pybar_Analysis
alecngai/07_Pwelett_Hackard_Analysis
We will complete two technical analyses using SQL and pgAdmin 4. Using the data provided to us, we will produce two results: the number of retiring employees by title, and the employees eligible for the mentorship program.
alecngai/08_Movies_ETL
Tasked with creating datasets for a hackathon. The client, "Amazing Prime", wants a dataset that is always up to date, produced by an automated process that extracts, transforms, and loads the source data into a clean, workable dataset. We do this in four tasks: write an ETL function to read three data files, extract and transform the Wikipedia data, extract and transform the Kaggle and rating data, and load the data into a PostgreSQL movie database.
alecngai/14_Bikesharing
Using Tableau, we analyze the city's bike-sharing platform from several angles.
alecngai/17_Credit_Risk_Analysis
Credit risk is very tough to predict. In this project we look at how the factors in our loan_stats csv help predict whether someone is low or high risk. One approach data scientists use for this type of problem is to build models, then train and evaluate them. In this project we use the imbalanced-learn and scikit-learn libraries to build models and evaluate them using resampling. In the first few models I oversampled the data using the RandomOverSampler and SMOTE algorithms and undersampled the data with the ClusterCentroids algorithm. In the remaining models I used a combined over- and undersampling approach with SMOTEENN. Finally, I compared two machine learning models that reduce bias, BalancedRandomForestClassifier and EasyEnsembleClassifier.
alecngai/06_World_Weather_Analysis
alecngai/09_Surfs_up
alecngai/10_Mission_to_Mars
alecngai/11_UFOs
alecngai/12_Plotly_Charts
alecngai/13_Mapping_Earthquakes
alecngai/15_MechaCar_Statistical_Analysis
alecngai/18_Cryptocurrencies
alecngai/19_Churn_Prediction
alecngai/alecngai
Config files for my GitHub profile.
alecngai/AutoEq
Automatic headphone equalization from frequency responses
alecngai/NSUS_BI_Assessment
alecngai/Obsidian_Notebook
alecngai/Portfolio