Big-Data-Project: Machine learning with Spark Streaming for ham/spam detection on the Enron email classification dataset. Incoming batches are processed with scikit-learn using incremental learning. Built Gaussian Naive Bayes, Multinomial Naive Bayes, SGD, and Passive-Aggressive classifiers, plus a MiniBatchKMeans clustering model. Used scikit-learn's partial_fit method to update each model online, one batch at a time, and joblib to save the trained models.
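
The batch-wise update and model-saving steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the toy batches stand in for the micro-batches that Spark Streaming would deliver, and the names `BATCHES`, `train_incrementally`, `save_and_reload`, and the file name `sgd_spam.joblib` are hypothetical. The use of `HashingVectorizer` for feature extraction is also an assumption, chosen because it is stateless and needs no fitting pass over the full dataset.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

# Hypothetical toy batches standing in for streamed micro-batches of
# (email texts, labels). Labels: 0 = ham, 1 = spam.
BATCHES = [
    (["win free money now", "meeting moved to 3pm"], [1, 0]),
    (["claim your free prize", "please review the attached report"], [1, 0]),
    (["free money prize offer", "lunch meeting tomorrow"], [1, 0]),
]
CLASSES = [0, 1]


def train_incrementally(batches):
    """Update one SGD classifier batch by batch with partial_fit."""
    # Assumption: a stateless hashing vectorizer, so no vocabulary has to be
    # learned up front. alternate_sign=False keeps the features non-negative.
    vectorizer = HashingVectorizer(n_features=2**12, alternate_sign=False)
    model = SGDClassifier(random_state=0)
    for texts, labels in batches:
        features = vectorizer.transform(texts)
        # The full label set must be known on the first partial_fit call,
        # because a single batch may not contain every class.
        model.partial_fit(features, labels, classes=CLASSES)
    return vectorizer, model


def save_and_reload(model, path):
    """Persist the trained model with joblib and load it back."""
    joblib.dump(model, path)
    return joblib.load(path)


vectorizer, model = train_incrementally(BATCHES)
model_path = os.path.join(tempfile.mkdtemp(), "sgd_spam.joblib")
restored = save_and_reload(model, model_path)

sample = vectorizer.transform(["free prize money", "report for the meeting"])
predictions = restored.predict(sample)
print(predictions)
```

The same loop should work for the other estimators that expose `partial_fit` (for example `MultinomialNB` or `PassiveAggressiveClassifier`), although `GaussianNB` expects dense input and `MiniBatchKMeans` takes no labels, so those would need small adjustments.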