kohjiaxuan
Data Scientist. Enjoys doing data science projects in my free time (update: no longer true because doing masters XD)
Singapore
Pinned Repositories
100-pandas-puzzles
100 data puzzles for pandas, ranging from short and simple to super tricky (60% complete)
Data-Science-Competition-for-Revenue-Maximization
Data Science Competition that challenged teams to come up with creative ways to increase the revenue of an e-commerce company. Won 1st place! Write-up in repository
Fraud-Detection-Pipeline
A structured data science pipeline for classification problems that handles scaling, sampling, and k-fold cross-validation with evaluation metrics
lstm_basics
Basics of LSTMs and autoencoders
Machine-Learning-with-sklearn
Practice Jupyter Notebooks for my machine learning journey with Python. Please refer to other repositories for completed projects!
NLP-Model-for-Corpus-Similarity
An NLP algorithm I developed to determine the similarity or relation between two documents/Wikipedia articles. Inspired by the cosine similarity algorithm and built from WordNet.
Predicting-HDB-Price-with-Machine-Learning
Data project predicting HDB resale flat prices using data cleaning, feature engineering, and machine learning. Models used: Random Forest, XGBoost, Neural Networks, Decision Tree, Support Vector Regressors, Linear Regression
Stock-Market-Dashboard
A stock market dashboard, built on an external API, that tracks the daily performance of stocks
Visualisation-of-Gradient-Descent
By visualizing the gradient descent algorithm applied to a set of points that fit a quadratic equation, we can better understand how the algorithm works in machine learning
Wikipedia-Article-Scraper
A complete Python text analytics package that allows users to search for a Wikipedia article, scrape it, conduct basic text analytics, and integrate it into a data pipeline without writing excessive code.
kohjiaxuan's Repositories
kohjiaxuan/Wikipedia-Article-Scraper
A complete Python text analytics package that allows users to search for a Wikipedia article, scrape it, conduct basic text analytics, and integrate it into a data pipeline without writing excessive code.
kohjiaxuan/Stock-Market-Dashboard
A stock market dashboard, built on an external API, that tracks the daily performance of stocks
kohjiaxuan/NLP-Model-for-Corpus-Similarity
An NLP algorithm I developed to determine the similarity or relation between two documents/Wikipedia articles. Inspired by the cosine similarity algorithm and built from WordNet.
kohjiaxuan/Predicting-HDB-Price-with-Machine-Learning
Data project predicting HDB resale flat prices using data cleaning, feature engineering, and machine learning. Models used: Random Forest, XGBoost, Neural Networks, Decision Tree, Support Vector Regressors, Linear Regression
kohjiaxuan/Visualisation-of-Gradient-Descent
By visualizing the gradient descent algorithm applied to a set of points that fit a quadratic equation, we can better understand how the algorithm works in machine learning
kohjiaxuan/Data-Science-Competition-for-Revenue-Maximization
Data Science Competition that challenged teams to come up with creative ways to increase the revenue of an e-commerce company. Won 1st place! Write-up in repository
kohjiaxuan/Fraud-Detection-Pipeline
A structured data science pipeline for classification problems that handles scaling, sampling, and k-fold cross-validation with evaluation metrics
kohjiaxuan/100-pandas-puzzles
100 data puzzles for pandas, ranging from short and simple to super tricky (60% complete)
kohjiaxuan/lstm_basics
Basics of LSTMs and autoencoders
kohjiaxuan/Machine-Learning-with-sklearn
Practice Jupyter Notebooks for my machine learning journey with Python. Please refer to other repositories for completed projects!