Pinned Repositories
big_data_for_chimps
A Seriously Fun guide to Big Data Analytics in Practice
datasets
Datasets that I generally use for trainings and workshops
Datasets-1
Machine learning datasets used in tutorials on MachineLearningMastery.com
Flight_delay_prediction_web_app
A big data web application to predict USA airline traffic delay with Python, Flask, Apache Spark, Kafka, MongoDB, ElasticSearch, d3.js, scikit-learn, MLlib and Apache Airflow.
free-programming-books
:books: Freely available programming books
lc
A list of 160+ leetcode questions grouped by their common patterns
nlp-datasets
Alphabetical list of free/public domain datasets with text data for use in Natural Language Processing (NLP)
SparkInternals
Notes talking about the design and implementation of Apache Spark
standards
Standards and guidelines at Gilt
SynapsePySparkWordCount
Create Spark Job Definition
Lambda ML's Repositories
LambdaML/kafka_exporter
Kafka exporter for Prometheus
LambdaML/ammonite-term-repl
Scala Ammonite REPL in Emacs term mode.
LambdaML/api-development-tools
:books: A collection of useful resources for building RESTful HTTP+JSON APIs.
LambdaML/apparate
Make your libraries magically appear in Databricks.
LambdaML/auto
A collection of source code generators for Java.
LambdaML/aws-ec2-ssh
Manage AWS EC2 SSH access with IAM
LambdaML/cmder
Lovely console emulator package for Windows
LambdaML/codecombat
Game for learning how to code.
LambdaML/darnassus
Prototype Replication and Streaming Platform
LambdaML/data-pipeline-samples
This repository hosts sample pipelines
LambdaML/dev-setup
macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.
LambdaML/dockbix-xxl
:whale: Dockerized Zabbix - server, web, proxy, java gateway, snmpd with additional extensions
LambdaML/docker-machine-parallels
Parallels driver for Docker Machine https://github.com/docker/machine
LambdaML/docs
Rapid CloudFormation: Modular, production-ready, open source.
LambdaML/dr-elephant
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
LambdaML/eventuate-examples-java-customers-and-orders
Java version of the Customers and Orders event sourcing example from my presentations
LambdaML/google-java-format-gradle-plugin
LambdaML/graphite_exporter
Server that accepts metrics via the Graphite protocol and exports them as Prometheus metrics
LambdaML/hydra-spark
LambdaML/IDEA-Native-Terminal-Plugin
Native Terminal Plugin for IntelliJ IDEs
LambdaML/kafka-monitor
Monitor the availability of Kafka clusters with generated messages.
LambdaML/learn-cloudformation
Learn how to use Infrastructure as Code on AWS with the help of CloudFormation.
LambdaML/learn-fargate
Labs helping you to learn AWS Fargate within a few hours.
LambdaML/learn-scala
LambdaML/learning-spark
Example code from Learning Spark book
LambdaML/localstack
💻 A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline!
LambdaML/ScreenToGif
🎬 ScreenToGif allows you to record a selected area of your screen, edit and save it as a gif or video.
LambdaML/spark-testing-base
Base classes to use when writing tests with Spark
LambdaML/terraform-aws-airflow
Terraform module to deploy an Apache Airflow cluster on AWS, backed by RDS PostgreSQL for metadata, S3 for logs and SQS as message broker with CeleryExecutor
LambdaML/the-monitor
Markdown files for Datadog's long-form blog posts: https://www.datadoghq.com/blog/