ahujaya
Data Analyst @ Resolution Life, Sydney
Toolkit: Python | Snowflake SQL | BigQuery | PowerBI | Azure | GCP | AWS | R | Tableau | MS-Excel/Google Sheets
Deakin University, Melbourne
Sydney, Australia
ahujaya's Repositories
ahujaya/Analyze-AB-Test-Results-Python
A/B tests are commonly performed by data analysts and data scientists, so it is important to get some practice working through their difficulties. This project examines the results of an A/B test run by an e-commerce website. The goal is to work through the notebook to help the company decide whether to implement the new page, keep the old page, or run the experiment longer before making a decision.
ahujaya/Classification-Model-for-Airbnb-AI-RapidMiner
Gained insights into New York City Airbnb rental properties and identified the neighbourhoods with the most attractive Airbnb rentals and the types of rental property with the most reviews. Furthermore, assessed the economic viability of rentals with missing reviews using machine learning models such as k-NN, decision tree and gradient boosted tree (GBT) classifiers, implemented in the data science platform RapidMiner.
ahujaya/Energy-Prediction-of-Domestic-Appliances-Dataset-R
The given dataset, "Energy20.txt", can be used to create models of the energy use of appliances in an energy-efficient house. It records the energy use of appliances (denoted Y) across 671 samples and is a modified version of the data used in the study [1]. The dataset includes five variables, X1 to X5, plus Y: X1: temperature in the kitchen area, in Celsius; X2: humidity in the kitchen area, as a percentage; X3: temperature outside (from the weather station), in Celsius; X4: humidity outside (from the weather station), as a percentage; X5: visibility (from the weather station), in km; Y: energy use of appliances, in Wh.
ahujaya/Wrangle-and-Analyze-Twitter-Data-Python
The dataset that I will be wrangling, analyzing and visualizing is the tweet archive of Twitter user @dog_rates, also known as WeRateDogs. WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog. These ratings almost always have a denominator of 10. The numerators, though? Almost always greater than 10. 11/10, 12/10, 13/10, etc. Why? Because "they're good dogs Brent." WeRateDogs has over 4 million followers and has received international media coverage.
ahujaya/ahujaya
Config files for my GitHub profile.
ahujaya/Communicate-Data-Findings-Python
Bay Wheels is a public bicycle sharing system in the San Francisco Bay Area, California. I'm most interested in exploring how trip duration and rental frequency vary by time of day and day of the week, and how these patterns relate to the riders' characteristics (user type, gender, age, etc.), to get a sense of how and for what purposes people use the bike sharing service.
ahujaya/Customer-Churn-Prediction-for-TESCO-UK-Supermarket-Chain-MS-Excel
Built a model to predict customer churn using logistic regression in MS-Excel and evaluated its performance on the provided holdout sample using confusion-matrix metrics. Compared its performance against traditional approaches such as RFM and random models using lift analysis.
ahujaya/European-Football-Analysis-Python
The analysis uses the soccer database from Kaggle, which contains data on soccer matches, players, and teams from several European countries from 2008 to 2016. It covers the top home and away teams, the overall top teams, the players with the most penalties, etc.
ahujaya/Web-Logs-Exploratory-Data-Analysis-and-Web-Crawling-Python
Web Logs Exploratory Data Analysis & Web Crawling of citation information from Google Scholar
ahujaya/Web-Logs-Unsupervised-and-Supervised-Machine-Learning-Association-Rule-Mining-ARIMA-Prediction
Unsupervised and Supervised Learning, Association Rule Mining & ARIMA Prediction on Web Logs Data. Web Crawling of citation information from Google Scholar
ahujaya/Data-Modelling-for-TPM-a-paper-mill-company-MS-Excel
Modelled the interaction between variables such as paper quantity, quality and brand image, and tested whether the interaction was statistically significant using hypothesis testing in MS-Excel. Built a model in Excel to predict the likelihood of customers signing a contract with TPM using multiple linear regression. Generated a time-series model to forecast TPM's turnover for the next four quarters.
ahujaya/Estimation-Model-for-Airbnb-AI-RapidMiner
Gained insights into New York City Airbnb rental properties and discovered trends in price and customer satisfaction level. Also identified which kinds of rentals receive which satisfaction levels, and predicted the likely satisfaction level of new rentals using clustering algorithms such as k-means and estimation algorithms such as linear regression, decision tree and gradient boosted trees.
ahujaya/github-slideshow
A robot powered training repository :robot:
ahujaya/Linear-Programming-and-Solving-Two-Player-Zero-Sum-Games-R
Linear Programming and Solving Two Player Zero Sum Games using R
ahujaya/sit742
SIT742: Modern Data Science