Rcourse
Rcourse_vol1
Course for Big Data seminar.
Contents:
- Introduction to syntax
- Vectors, matrices, data.frames, lists
- Graphics
- Regression analysis
- Time series analysis
- Cluster analysis
- BONUS - data stream clustering
Rcourse_vol2
Materials for Big Data seminar and KDD course.
Contents:
- Introduction to syntax,
- Visualisations (basic, ggplot2, plotly, dygraph, animations),
- Time series classification and cluster analysis.
Cluster Analysis
Presentation for Big Data seminar.
Contents:
- What is clustering?
- Types of clustering methods
  - Centroid-based
  - Model-based
  - Density-based
  - Spectral clustering
  - Hierarchical clustering
KDD exercises
Contents:
- Classification of Iris dataset with Naive Bayes method,
- Feature engineering and exploratory analysis with the ozone and leukemia gene datasets (description of TASKS),
- Typical workflow for linear regression task with ozone dataset,
- Linear regression on Boston housing dataset (description of TASKS),
- Typical workflow for logistic regression (classification task) with Titanic dataset (download dataset from here),
- Dimensionality reduction: PCA explained step by step, multidimensional scaling, and t-SNE with the pendigits dataset (description of dataset).
More info
For more information and tutorials, please visit my website: petolau.github.io.