sparkml
There are 78 repositories under the sparkml topic.
salesforce/TransmogrifAI
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
linzhouzhi/SparkML
Spark machine learning: using Jupyter to explain algorithm principles and run related examples
vivek-bombatkar/MyLearningNotes
Because it's never too late to start taking notes and make them public...
aws/sagemaker-sparkml-serving-container
This code is used to build & run a Docker container for performing predictions against a Spark ML Pipeline.
alipay/jpmml-sparkml-lightgbm
JPMML-SparkML plugin for converting LightGBM-Spark models to PMML
hexnn/Stark
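A simple, easy-to-use, ultra-high-performance big data governance engine built on Spark, SparkMLlib, and Debezium. It is suited to unified batch and stream data integration and data analysis, and supports machine learning algorithm models, real-time CDC data capture, data quality validation, data modeling, algorithm modeling, and OLAP data analysis.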
hhsecond/ml2rt
Machine learning utilities for model conversion, serialization, loading, etc.
sebsui/JavaRank
Recommendation engine in Java. Based on an ALS algorithm (Apache Spark). Train a new model after N seconds.
colbyford/sparkitecture
A collection of “cookbook-style” scripts for simplifying data engineering and machine learning in Apache Spark.
cheukhin1024/Financial-Data-Project-in-Azure
Free High-Quality Financial Data in Azure
daniel-acuna/pyspark_pipes
Helper functions for building complex Spark ML pipelines
chaokunyang/bigdata-examples
Big data examples for Spark and Flink
Hamza88-coder/Real-Time-Recruitment-System-with-AI-and-Data-Analytics
Simulation of job offers and CVs with real-time processing, classification, and analytics using Kafka, Ray, Spark, and Databricks. Includes a Flask-based recommendation system and Tableau visualizations.
Subham2S/BigData-Engineering-Capstone-Project-1
BigData Engineering Capstone Project with tech stack: Linux, MySQL, Sqoop, HDFS, Hive, Impala, SparkSQL, SparkML, Git
chenliny-zz/Flight_Delay_Prediction
A machine learning at scale demo on flight delay prediction. The project includes an exploration of a series of data transformation and ML pipelines in Apache Spark (via Databricks).
jpacerqueira-zz/Akamai-log-Analysis-SparkML-H2o
Transformation of Akamai logs with Spark ETL, and discovery of values and similarities in the logs using SparkML and H2O ML
mdh266/TwitterSentimentAnalysis
Twitter Sentiment Analysis using Spark, MongoDB, and Google Cloud
ozancicek/artan
Online latent state estimation with Spark
lijoabraham/spark-playground
Data analysis using Apache Spark
alivcor/node-red-contrib-sparkml
NodeRED Extension Pack for SparkML / Apache Spark
fediazgon/sparkml-flights-delay
Predicting the arrival delay time of commercial flights
rdolor/kaggle-house-price-regression
Repo for using Scala for Kaggle house price prediction.
Pirata-Codex/Sentiment-Analysis-SparkML
Using SparkML to build different machine learning models that simulate big data management at a small scale
santiagxf/portable-sparkml
This repository shows how to create containerized versions of models trained with Spark MLlib
anant1203/Malware-Classification
This repository classifies documents into one of several possible malware families using Google Cloud Platform, PySpark, and Jupyter notebooks. This project was done for CSCI8360: Data Science Practicum at The University of Georgia.
AndreasTraut/Machine-Learning-with-Python
Repository showing my machine-learning experiences with Python, SkLearn and Apache Spark. Provides templates for standard ML problems as well as for big-data ML problems.
andreiramani/Machine-Learning-with-Apache-Spark
Coursera IBM Data Engineering (Course 12 of 13)
Crone1/Spark-Recommender-System
This project involves using PySpark to create a recommendation system on the Google Cloud Platform
gurug-dev/distributed_data_systems_project
Sentiment Analysis and SparkML modeling on Financial Data using HuggingFace, Spark, MongoDB, Airflow and GCS.
ph2017001/FuzzyMatch_Spark
FuzzyMatch a Query Set with a Reference Set Using Spark
pregismond/build-ml-pipeline-airfoil-noise-prediction
Build a Machine Learning Pipeline for Airfoil Noise Prediction
skamalj/machine-learning
This repository is a collection of IPython notebooks implementing various ML algorithms in Spark and SystemML
SudhansuTaparia/BigData
This is a repository I created to share some of the knowledge I have gained about Big Data technologies, especially Spark, GraphX, etc.
supersjgk/Marketing_Campaign_Analysis
A Data Science project for Marketing Campaign Analysis
tam-ng/BigData-Solution-Gaming-Platform
Big Data Solution for Gaming eCommerce Platform