/big-data-project

Website about Big Data for CO163 Topics at Imperial College London


This repository contains the results of a research project about Machine Learning and Big Data created for the CO163 Topics course at Imperial College London. My group and I achieved a final mark of 91.4% for the website, winning the Outstanding Group Project prize for its content and the quality/depth of our research.

View the site live here!

📰 Abstract

Big Data has been one of the most-used buzzwords in technology since the late 2000s. The tools and techniques used to process and analyse such large amounts of data were often not designed for that scale, leading to weaknesses and inefficiencies. This project examines a few of the machine learning techniques that can be applied to large datasets. We assess their advantages, their flaws, and potential improvements to the underlying algorithms that could increase output quality and efficiency. The project also includes Python implementations of decision trees and an interactive demonstration of some of their flaws, written in JavaScript.

📖 Topics Covered

  • Dimensionality Reduction
  • Decision Trees
  • Random Forests
  • The ID3 Algorithm
  • AdaBoost and RobustBoost
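
As a rough illustration of the ID3 topic listed above, the sketch below shows the split criterion ID3 is generally described as using: pick the feature with the highest information gain (the drop in Shannon entropy after splitting). This is not the project's own implementation. The function names (`entropy`, `information_gain`, `best_feature`) and the toy weather data are assumptions made up for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the rows on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        # Collect the labels that fall under each value of the feature.
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_feature(rows, labels, features):
    """Feature with the highest information gain (the ID3 split choice)."""
    return max(features, key=lambda f: information_gain(rows, labels, f))


# Hypothetical toy dataset: the label depends only on "outlook".
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["no", "no", "yes", "yes"]

chosen = best_feature(rows, labels, ["outlook", "windy"])
```

On this toy data, splitting on `outlook` separates the classes perfectly while `windy` tells us nothing, so `outlook` is chosen. A full ID3 tree would repeat this choice recursively on each resulting subset.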

🤝 Contributors

Rohan Pritchard, Rohan Padmanabhan, George Zhelev