Algorithms for Big Data.

Primary language: Jupyter Notebook. License: GNU General Public License v3.0 (GPL-3.0).

This repository contains implementations of algorithms for processing big data, in both batch and streaming settings.
The implemented algorithms are:

  1. Simple and Multiple Linear Regression using Gradient Descent (batch data)
  2. Simple and Multiple Linear Regression using Gradient Descent (stream data)
  3. Simple and Multiple Linear Regression using Normal Equation Methods (batch data)
  4. Incremental Mathematical Stream Regression and Approximate Stream Regression
  5. Collaborative Filtering (Stochastic Gradient Descent for Matrix Factorization)
  6. Collaborative Filtering (Distributed Stochastic Gradient Descent for Matrix Factorization)
  7. Collaborative Filtering (Streaming Distributed Stochastic Gradient Descent for Matrix Factorization)
  8. k-means and k-medoids (batch data)
  9. STREAM Algorithm
  10. CluStream Algorithm
  11. ID3 and CART (batch data)
  12. Hoeffding Tree

Note: The Data folder contains the datasets used by these algorithms.
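
To give a feel for the first and third entries in the list, here is a minimal sketch of simple linear regression fitted two ways: by batch gradient descent and by the closed-form least-squares (normal equation) solution. It is not taken from the notebooks in this repository. The function names `fit_batch_gd` and `fit_normal_equation`, the learning rate, the epoch count, and the toy data are all assumptions chosen for illustration.

```python
def fit_batch_gd(xs, ys, lr=0.05, epochs=2000):
    """Fit y ≈ w*x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of (1/n) * sum((w*x + b - y)^2), computed over the full batch.
        grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def fit_normal_equation(xs, ys):
    """Closed-form least-squares solution for the one-feature case."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical, not from the Data folder).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w_gd, b_gd = fit_batch_gd(xs, ys)
w_ne, b_ne = fit_normal_equation(xs, ys)
print(f"gradient descent: w={w_gd:.4f}, b={b_gd:.4f}")
print(f"normal equation:  w={w_ne:.4f}, b={b_ne:.4f}")
```

On this noise-free toy data both methods should recover a slope close to 2 and an intercept close to 1. The gradient descent variant needs a suitable learning rate and enough epochs, while the closed-form variant needs neither but must see all the data at once.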