
Machine Learning with PySpark for Big Data Project


Machine-Learning-with-Spark-Streaming

Team BD_246_409_453_574 @Lohith5292 @sachinsachims @ShankYadav @VishnuJG


The following graphs show the accuracy of each model for the different batch sizes used.

  • The first model i.e., Bernoulli Naive Bayes model accuracy graph

  • The second model i.e., Linear SGD Classification model accuracy graph

  • The third and last classification model i.e., Multinomial Naive Bayes model accuracy graph

  • The KMeans Clustering model accuracy graph

  • Comparing the accuracy of all these models together, we get:

We can see that even though the Linear SGD model dips in accuracy at the beginning, it performs well as the batch size increases.


A similar graph of the errors is also plotted for all the models together, to make the different models easier to compare.
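
As a rough illustration of how the points on such accuracy and error graphs could be computed, here is a minimal sketch in plain Python. It is not the project's actual code: the helper names `batch_accuracy` and `accuracy_and_error_by_batch_size`, the layout of the `results` dictionary, and the definition of error as `1 - accuracy` are all assumptions made for this example, and the labels are toy values rather than results from the project.

```python
def batch_accuracy(y_true, y_pred):
    """Fraction of predictions in one batch that match the labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label and prediction lists differ in length")
    if not y_true:
        raise ValueError("empty batch")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def accuracy_and_error_by_batch_size(results):
    """Average accuracy and error per batch size.

    `results` maps a batch size to a list of (y_true, y_pred) pairs,
    one pair per streamed batch. This layout is hypothetical.
    Error is assumed here to be the misclassification rate, 1 - accuracy.
    """
    summary = {}
    for batch_size, batches in sorted(results.items()):
        accuracies = [batch_accuracy(t, p) for t, p in batches]
        mean_acc = sum(accuracies) / len(accuracies)
        summary[batch_size] = {"accuracy": mean_acc, "error": 1.0 - mean_acc}
    return summary


# Toy labels, purely illustrative; not results from the project.
toy = {
    2: [([0, 1], [0, 0]), ([1, 1], [1, 1])],
    4: [([0, 1, 1, 0], [0, 1, 1, 1])],
}
summary = accuracy_and_error_by_batch_size(toy)
for size, scores in summary.items():
    print(size, scores)
```

The resulting dictionary has one entry per batch size, so its keys can serve as the x-axis and the `accuracy` or `error` values as the y-axis of a per-model line plot.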