Distributed computing and Machine Learning
Folder | Description |
---|---|
spark_understanding | Spark API usage for distributed computing [Incremental Contribution] |
spark_project_regression | Regression using PySpark MLlib [Incremental Addition of ML] |

Follow this page for upcoming tutorials on data analysis. Planned topics:
- Decision Tree Regressor
- Classification https://spark.apache.org/docs/latest/ml-classification-regression.html#classification
- Clustering [K-means] https://spark.apache.org/docs/latest/ml-clustering.html
- Random Forest https://spark.apache.org/docs/latest/ml-classification-regression.html#random-forest-classifier
- Principal component analysis (PCA)
- Singular value decomposition (SVD)
- Frequent Pattern Mining