/PySpark_tutorial

Examples of PySpark

Primary Language: Python

# Exercises of PySpark

data contains the test data.

word_count is a basic example for understanding Spark.
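
As a rough illustration of what a basic word count looks like with the RDD API, here is a minimal sketch. It is not taken from the repository: the function names, the `data/words.txt` path, and the whitespace tokenization are all assumptions made for the example.

```python
from operator import add


def tokenize(line):
    """Split a line into lowercase words on whitespace."""
    return line.lower().split()


def word_count(lines_rdd):
    """Return an RDD of (word, count) pairs from an RDD of text lines."""
    return (
        lines_rdd.flatMap(tokenize)        # one record per word
        .map(lambda word: (word, 1))       # pair each word with a count of 1
        .reduceByKey(add)                  # sum the counts per word
    )


def main(path="data/words.txt"):  # hypothetical input path
    # Imported here so the helpers above can be loaded without a Spark install.
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "word_count_sketch")
    try:
        for word, count in word_count(sc.textFile(path)).collect():
            print(word, count)
    finally:
        sc.stop()
```

Calling `main()` runs the job on a local master; `word_count` itself only needs an RDD of strings, so it can be reused with any input source.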

Run nc -lk 9999 to simulate a stream of words over a socket.

top_n returns the top N numbers from the input; the approach follows an external reference.
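
One common way to get the top N values from a distributed dataset is to keep at most N candidates per partition and then merge the candidates on the driver. The sketch below assumes an RDD of plain numbers; the function names are hypothetical and the repository may well do this differently.

```python
import heapq


def local_top_n(numbers, n):
    """Return the n largest values from an iterable, largest first."""
    return heapq.nlargest(n, numbers)


def top_n(rdd, n):
    """Return the n largest values in an RDD of numbers, largest first."""
    # Each partition contributes at most n candidates, so the driver only
    # has to merge (number of partitions * n) values.
    candidates = rdd.mapPartitions(lambda part: local_top_n(part, n)).collect()
    return local_top_n(candidates, n)
```

PySpark also ships a built-in `rdd.top(n)` that returns the same kind of result, so the hand-written version is mainly useful for seeing how the per-partition reduction works.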

median computes the median of a batch of data.
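
A straightforward (if not the cheapest) way to compute an exact median is to sort the values, attach positions, and pick the middle one or two. The following sketch assumes a non-empty RDD of numbers; the helper names are made up for this example and are not the repository's.

```python
def middle_indices(count):
    """Zero-based positions of the middle element(s) of a sorted sequence."""
    return (count - 1) // 2, count // 2


def median(rdd):
    """Return the median of an RDD of numbers as a float."""
    count = rdd.count()
    if count == 0:
        raise ValueError("median of an empty RDD is undefined")
    low, high = middle_indices(count)
    middle = (
        rdd.sortBy(lambda x: x)
        .zipWithIndex()                                # (value, position)
        .filter(lambda pair: pair[1] in (low, high))   # keep middle position(s)
        .keys()
        .collect()
    )
    # One value for an odd count, two values for an even count.
    return sum(middle) / len(middle)
```

For an odd number of elements `low` and `high` coincide, so the result is the single middle value; for an even number it is the mean of the two middle values.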

inverted_index builds an inverted index from the input data.
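
An inverted index maps each word to the documents that contain it. The sketch below assumes the input is an RDD of `(doc_id, text)` pairs and tokenizes on whitespace; both the input shape and the function names are assumptions, since the source does not describe the data format.

```python
def postings(doc):
    """Emit one (word, doc_id) pair per distinct word in a document."""
    doc_id, text = doc
    return [(word, doc_id) for word in set(text.lower().split())]


def inverted_index(docs_rdd):
    """Return an RDD of (word, sorted list of doc_ids) pairs."""
    return (
        docs_rdd.flatMap(postings)   # (word, doc_id) for every distinct word
        .groupByKey()                # gather all doc_ids for each word
        .mapValues(sorted)           # stable, readable ordering per word
    )
```

With documents `("d1", "spark is fast")` and `("d2", "spark streaming")`, the entry for `spark` would list both `d1` and `d2`, while `fast` would list only `d1`.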

count_once finds the single element that occurs only once in a list in which every other element occurs exactly twice.
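
When the elements are integers, a well-known trick for this problem is to XOR everything together: pairs cancel out (`x ^ x == 0`) and only the unpaired value remains. The sketch below uses that trick under the assumption of integer input; it is one possible approach and not necessarily the one used in the repository.

```python
from operator import xor


def count_once(rdd):
    """Return the one integer that appears once when all others appear twice."""
    # XOR is associative and commutative, so Spark can combine partial
    # results from the partitions in any order.
    return rdd.reduce(xor)
```

For non-integer elements, an alternative is to map each element to `(element, 1)`, sum with `reduceByKey`, and keep the key whose count is 1.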

spark_streaming contains examples of socket streaming and Kafka streaming.

spark_structured_streaming contains examples of socket streaming.
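
For orientation, a socket-based Structured Streaming word count might look roughly like the sketch below, which follows the pattern from the Spark programming guide. The function names, the app name, and the default host are assumptions; port 9999 matches the nc command mentioned above.

```python
def socket_options(host="localhost", port=9999):
    """Options for Spark's built-in socket source."""
    return {"host": host, "port": str(port)}


def word_counts_from_socket(spark, host="localhost", port=9999):
    """Build a streaming DataFrame of running word counts."""
    # Imported here so this module can be loaded without a Spark install.
    from pyspark.sql.functions import explode, split

    lines = (
        spark.readStream.format("socket")
        .options(**socket_options(host, port))
        .load()
    )
    # The socket source yields a single string column named "value".
    words = lines.select(explode(split(lines.value, " ")).alias("word"))
    return words.groupBy("word").count()


def main():
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("socket_word_count_sketch").getOrCreate()
    query = (
        word_counts_from_socket(spark)
        .writeStream.outputMode("complete")   # re-emit the full counts table
        .format("console")
        .start()
    )
    query.awaitTermination()
```

Start nc -lk 9999 in one terminal before calling `main()` in another; words typed into the nc session should then show up as updated counts on the console.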