Repository containing all my projects.
-
- Random Forests
- Blue Book for Bulldozers: This project builds an efficient Random Forest model for the Kaggle competition Blue Book for Bulldozers (https://www.kaggle.com/c/bluebook-for-bulldozers/overview). It goes through each step in detail, from data preparation and hyperparameter tuning to data interpretation and analysis.
- Deep Learning
- Face Expression Classifier: In this Jupyter notebook, I train a convolutional neural network (CNN) on data from face-expression-recognition-dataset. The dataset contains images of faces showing a variety of expressions, labelled angry, disgust, fear, happy, neutral, sad and surprise. I use the fast.ai and PyTorch libraries to train and tune a CNN with the resnet50 architecture. (For an interactive experience with the Jupyter notebook, download it from the Kaggle link and follow the instructions.)
-
-
Python
- Cyclability Analysis (SQL): Calculates a Cyclability Score (cyclability is the extent to which an area is cyclable) for different neighbourhoods in Sydney by gathering and integrating several datasets for analysis. The calculation relies on spatially joining the boundaries of neighbourhoods with the locations of bike-sharing pods.
-
R
-
Crash Severity and Frequency: This report aims to determine whether certain factors affect the frequency or severity of car crashes, and to draw conclusions about what the Australian government should do to limit them. Using plots and the relevant statistical analysis, we attempt to show strong relationships between drug usage, license type and weather conditions on the one hand, and the severity and frequency of accidents on the other.
-
University Life and Stress: The aim of this report is to determine what factors of university life correlate the most with stress, to find out whether the time spent on each activity has a linear relationship with stress, and what the best way to reduce stress is. After designing a survey filled out by university students, we have attempted to analyze the responses with relevant graphs and show a correlation or relationship that can answer our research questions.
-
International and Domestic Students: The aim of this report is to determine, using hypothesis testing, whether there is a significant difference between the grades of International and Domestic students. If there is a difference, we attempt to establish a possible cause by using a scatterplot to check whether how often students use Canvas has any effect on their grades.
-
CensusAtSchool New Zealand: This report aims to test several claims about the population using a diverse and rich sample dataset downloaded from CensusAtSchool (NZ). The dataset contains many features of students from year 4 to year 13, including age, gender, opinions on drugs, ethnicity, cell phone habits, etc. A variety of tests have been used to answer the relevant questions, including the Chi-squared goodness of fit test, the test for homogeneity, the test for independence, Fisher's exact test and Monte Carlo simulation. Graphical and numerical summaries have been used where appropriate.
-
DATA2X02 Students Analysis: This report tests for significant differences between variables in data collected from students of the DATA2X02 2019 class at the University of Sydney. The data was collected using an online survey with many kinds of questions, resulting in various types of variables for analysis. A variety of tests have been used, including the Wilcoxon signed-rank test, the Wilcoxon rank-sum test, the permutation test and Hotelling's T-squared test. Graphical and numerical summaries have been used throughout to check the assumptions of each hypothesis test. A clean R Markdown theme with quality plots was also added to make the report visually appealing.
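As an illustration of the permutation test mentioned above, here is a minimal sketch in Python (the report itself is written in R). The function name `permutation_test`, its parameters and the sample numbers are assumptions made for this example and are not taken from the report. It compares the observed difference in group means with the differences obtained after repeatedly shuffling the group labels.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Approximate two-sided p-value for a difference in group means.

    Hypothetical helper for illustration; not code from the report.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Shuffling the pooled data is equivalent to reassigning labels at random.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1

    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Made-up weekly study hours for two groups of students.
hours_group_1 = [10, 12, 9, 14, 11, 13]
hours_group_2 = [8, 7, 11, 9, 6, 10]
print(permutation_test(hours_group_1, hours_group_2))
```

A small p-value suggests that the observed difference would be unusual if the group labels did not matter. Unlike a t-test, this approach does not assume normally distributed data.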
-
-
-
- Java
-
Atomination: This game revolves around placing atoms on a grid-based game board. Each grid space has a limit on how many atoms it can contain. Once the limit is reached, the atoms expand to the adjacent grid spaces. This simple rule has an interesting property: a single placement can trigger a chain of expansions, which acts as a mechanism for capturing grid spaces from your opponents. The game comes with a number of utility functions for players, such as saving the game, loading games and viewing game statistics. It also implements a set of commands that allow the players to interact with the game.
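The expansion rule can be sketched as follows. This is a Python illustration rather than the project's Java code, and it rests on two assumptions not stated above: that a grid space's limit equals its number of adjacent spaces, and that an expanding space sends one atom to each adjacent space and captures it. The names `neighbours` and `place_atom` are hypothetical.

```python
from collections import deque


def neighbours(row, col, height, width):
    """Orthogonally adjacent grid spaces that lie inside the board."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < height and 0 <= c < width]


def place_atom(grid, row, col, player, max_steps=10_000):
    """Place one atom for `player` and resolve any chain of expansions.

    `grid[r][c]` is an (owner, count) pair, with owner None for an empty space.
    Assumption: the limit of a space is its number of neighbours.
    """
    height, width = len(grid), len(grid[0])
    owner, count = grid[row][col]
    if owner not in (None, player):
        raise ValueError("space is owned by another player")
    grid[row][col] = (player, count + 1)

    queue = deque([(row, col)])
    steps = 0
    while queue and steps < max_steps:  # guard against endless chains
        steps += 1
        r, c = queue.popleft()
        _, count = grid[r][c]
        adjacent = neighbours(r, c, height, width)
        if count < len(adjacent):
            continue  # limit not reached, nothing to do
        # The space expands: one atom moves to each adjacent space.
        remaining = count - len(adjacent)
        grid[r][c] = (player if remaining else None, remaining)
        for nr, nc in adjacent:
            _, neighbour_count = grid[nr][nc]
            grid[nr][nc] = (player, neighbour_count + 1)  # captures the space
            queue.append((nr, nc))


# A 2x2 board where every space has two neighbours, so every limit is 2.
board = [[(None, 0), ("B", 1)], [(None, 0), (None, 0)]]
place_atom(board, 0, 0, "A")
place_atom(board, 0, 0, "A")  # reaches the limit and triggers a chain
print(board)
```

The queue processes spaces in the order they were affected, so one placement can ripple across the board. The `max_steps` guard is only there to keep the sketch from looping forever on an overfull board.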
-
AeroDB: AeroDB is a key-value database written entirely in the Java programming language using dynamic data structures. Each entry in the database is identified by a unique key string and contains a dynamically sized list of integer values.
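The data model described above can be sketched in a few lines. This is a Python illustration, not the project's Java implementation. The class name `KeyValueStore` and its methods are hypothetical and do not reflect AeroDB's actual commands.

```python
class KeyValueStore:
    """Maps unique key strings to dynamically sized lists of integers."""

    def __init__(self):
        self._entries = {}  # key string -> list of ints

    def set(self, key, values):
        """Create the entry, or replace its values if the key already exists."""
        self._entries[key] = [int(v) for v in values]

    def append(self, key, values):
        """Add values to the end of an existing entry (KeyError if missing)."""
        self._entries[key].extend(int(v) for v in values)

    def get(self, key):
        """Return a copy of the values so callers cannot mutate the store."""
        return list(self._entries[key])

    def delete(self, key):
        """Remove the entry (KeyError if missing)."""
        del self._entries[key]

    def keys(self):
        """All keys, in insertion order."""
        return list(self._entries)


db = KeyValueStore()
db.set("a", [1, 2])
db.append("a", [3])
print(db.get("a"))  # prints [1, 2, 3]
```

Python's built-in `dict` and `list` stand in here for the dynamic data structures that the project builds itself in Java.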
-
Please contact me at nama1arpit@gmail.com or anam9745@uni.sydney.edu.au