# PySpark Exercises
data contains the test data.
word_count is a basic example for getting started with Spark.
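
As a rough illustration of what a word count computes, the plain-Python sketch below mirrors the usual "split each line into words, then count per word" steps without needing a Spark installation. The function name `word_count` and the sample lines are hypothetical and are not taken from this repository.

```python
from collections import Counter


def word_count(lines):
    """Count whitespace-separated words across an iterable of text lines."""
    counts = Counter()
    for line in lines:
        # Splitting a line into words corresponds to the "flat map" step;
        # updating the counter corresponds to the per-key reduction.
        counts.update(line.split())
    return dict(counts)


# Hypothetical sample input, for illustration only.
sample_lines = ["spark makes big data simple", "big data big results"]
word_counts = word_count(sample_lines)
```

In a Spark job the same two steps would run per partition and the partial counts would then be merged by key.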
Run nc -lk 9999 to simulate a stream of words over a socket on port 9999.
top_n follows the reference linked here and returns the top N numbers from the input.
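
One common way to get the top N numbers from data that is split into chunks is to take the top N of each chunk and then merge those candidates. The sketch below shows that idea in plain Python with `heapq.nlargest`; it is an assumption about the approach, not necessarily how the top_n exercise is written, and the name `top_n_partitioned` is hypothetical.

```python
import heapq


def top_n_partitioned(partitions, n):
    """Return the n largest numbers across several lists, largest first."""
    # Step 1: each partition keeps only its own n largest values.
    local_tops = [heapq.nlargest(n, part) for part in partitions]
    # Step 2: merge the small candidate lists and take the overall top n.
    candidates = (value for top in local_tops for value in top)
    return heapq.nlargest(n, candidates)


# Hypothetical sample data: three "partitions", one of them empty.
sample_partitions = [[5, 1, 9], [3, 7, 9], []]
top_three = top_n_partitioned(sample_partitions, 3)
```

Keeping only N candidates per partition bounds the amount of data that has to be merged, regardless of how large each partition is.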
median computes the median of a batch of data.
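
For reference, a minimal plain-Python sketch of the median of one batch follows. It assumes the conventional definition in which an even-sized batch yields the average of its two middle values; the exercise itself may use a different convention, and the function name is hypothetical.

```python
def batch_median(values):
    """Return the median of a non-empty batch of numbers."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("median of an empty batch is undefined")
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        # Odd count: the single middle element.
        return ordered[mid]
    # Even count (assumed convention): average of the two middle elements.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_median = batch_median([3, 1, 2])
even_median = batch_median([4, 1, 3, 2])
```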
inverted_index builds an inverted index from the input data.
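
An inverted index maps each word to the documents that contain it. The sketch below builds one in plain Python under a few assumptions that are not stated in this repository: the input is a dictionary from document name to text, words are split on whitespace, and matching is case-insensitive. The names used are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercased word to the sorted list of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            # A set avoids listing a document twice for a repeated word.
            index[word].add(doc_id)
    return {word: sorted(doc_ids) for word, doc_ids in index.items()}


# Hypothetical sample documents, for illustration only.
sample_docs = {"a.txt": "spark is fast", "b.txt": "Spark streaming spark"}
inverted = build_inverted_index(sample_docs)
```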
count_once finds the single element that occurs exactly once in a list in which every other element occurs twice.
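
A well-known trick for this problem, assuming the elements are integers, is to XOR all of them together: each pair cancels to zero and only the unpaired value remains. The sketch below shows that trick in plain Python; whether the count_once exercise uses it is an assumption, and the function name is hypothetical.

```python
from functools import reduce
from operator import xor


def find_single(numbers):
    """Return the one integer that appears once when all others appear twice."""
    # x ^ x == 0 and x ^ 0 == x, so paired values cancel out.
    return reduce(xor, numbers, 0)


single = find_single([4, 1, 2, 1, 2])
```

Because XOR is associative and commutative, the partial results of separate chunks can be combined in any order, which is the property a distributed reduce needs.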
spark_streaming contains examples of socket streaming and Kafka streaming.
spark_structured_streaming contains examples of socket streaming.