Pinned Repositories
aas
Code to accompany Advanced Analytics with Spark from O'Reilly Media
awesome-data-engineering
A curated list of data engineering tools for software developers
awesome-interview-questions
:octocat: A curated awesome list of lists of interview questions. Feel free to contribute! :mortar_board:
awesome-opensource-data-engineering
An Awesome List of Open-Source Data Engineering Projects
awesome-public-datasets
A topic-centric list of high-quality open datasets in public domains. Propose NEW data ☛☛☛PR☛☛☛
leetcode
LeetCode Solutions: A Record of My Problem Solving Journey.
nelsonjiao's Repositories
nelsonjiao/go
The Open Source Data Science Masters
nelsonjiao/BigDL-Tutorials
Step-by-step Deep Learning Tutorials on Apache Spark using BigDL
nelsonjiao/tensorflow-workshop
This repo contains materials for use in a TensorFlow workshop.
nelsonjiao/Toruk
The great leonopteryx (Na'vi name: toruk meaning "last shadow") is a species of airborne predatory animals native to Pandora. Scientifically, it is known as Leonopteryx rex – "flying king lion" (from the Greek word λέων leon meaning lion, πτέρυξ pteryx meaning wing, and the Latin word rex, meaning king). The fierce beauty and nobility of the leonopteryx gave the species a central place in Na'vi lore and culture. It is celebrated in dance, song, and with elaborate totems that symbolize both the fear and respect given to the creature. Indeed, the leonopteryx is crucial to the Na'vi's sense of destiny and interconnectedness. I like Apache Zeppelin, but it could have a different avatar.
nelsonjiao/zeppelin-notebooks
Gallery of Apache Zeppelin notebooks
nelsonjiao/R-Programming---Swirl
Learning R Programming with Swirl
nelsonjiao/R_Kmeans
R_Kmeans
nelsonjiao/aas
Code to accompany Advanced Analytics with Spark from O'Reilly Media
nelsonjiao/openstack-manuals
OpenStack Manuals
nelsonjiao/spark-Jupyter-AWS
A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support
nelsonjiao/spark-two-migration
nelsonjiao/markdown-here
Google Chrome, Firefox, and Thunderbird extension that lets you write email in Markdown and render it before sending.
nelsonjiao/Head-First-Java
Code for Head First Java
nelsonjiao/webmagic
A scalable web crawler framework for Java.
nelsonjiao/spark-csv
CSV data source for Spark SQL and DataFrames
nelsonjiao/XX-Net
a web proxy tool
nelsonjiao/pyspider
A Powerful Spider (Web Crawler) System in Python.
nelsonjiao/zeppelin-d3-spell
D3.js Spell Visualization for Apache Zeppelin
nelsonjiao/csv2parquet
Create Parquet files from CSV
nelsonjiao/spark-ec2
Scripts used to set up a Spark cluster on EC2
nelsonjiao/Qix
Machine Learning, Deep Learning, PostgreSQL, Distributed System, Node.js, Golang
nelsonjiao/spark-ml-source-analysis
An analysis of the principles behind Spark ML algorithms and of their concrete source-code implementations
nelsonjiao/cc-mrjob
Demonstration of using Python to process the Common Crawl dataset with the mrjob framework
nelsonjiao/dkpro-c4corpus
DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate removal, language detection, and near-duplicate removal.
nelsonjiao/common_crawl_index
Index URLs in Common Crawl
nelsonjiao/zeppelin-examples-1
Zeppelin notebook examples
nelsonjiao/zeppelin-examples
This project is for examples of how to use Zeppelin. https://github.com/apache/incubator-zeppelin
nelsonjiao/d3-spark-ajax
Example code for presentation on using D3.js and d3pie, with AJAX, to call a Spark Job Server and display the result
nelsonjiao/webscale_nlp
language detection and topic modeling on multi-terabyte common crawl corpus
nelsonjiao/cs-101
Intro to Computer Science