An analysis of the Udacity discussion forum data, written in Python on Hadoop and Spark, that reports the top users, the top tags, the peak activity time, and the average question length.
- Technologies: Python, Apache Spark, Apache Hadoop
- Tools and OS: Atom, Ubuntu
- Platform: Hadoop Cluster
- Lines of Code: 93
- Duration: 2 Weeks (Nov 2016)
- Dataset: http://content.udacity-data.com/course/hadoop/forum_data.tar.gz
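The four metrics can be sketched in plain Python as below. This is a minimal illustration, not the project's actual code: the field names (`author_id`, `node_type`, `tagnames`, `body`, `added_at`), the timestamp layout, the sample rows, and the `analyze` helper are all assumptions made for the example. Tags are counted on questions only, and "peak active time" is taken to mean the hour of day with the most posts. A Hadoop or Spark job would distribute the same counting steps across the cluster.

```python
from collections import Counter

# Hypothetical sample rows. Field names and values are assumed for
# illustration and are not taken from the dataset itself.
SAMPLE_POSTS = [
    {"author_id": "100", "node_type": "question", "tagnames": "cs101 hadoop",
     "body": "How do I run a mapper?", "added_at": "2012-02-25 08:09:06"},
    {"author_id": "200", "node_type": "answer", "tagnames": "",
     "body": "Pipe the input through it.", "added_at": "2012-02-25 08:30:00"},
    {"author_id": "100", "node_type": "question", "tagnames": "hadoop",
     "body": "Why is my reducer slow?", "added_at": "2012-02-26 14:00:00"},
    {"author_id": "100", "node_type": "comment", "tagnames": "",
     "body": "Thanks!", "added_at": "2012-02-27 08:45:10"},
]


def analyze(posts, n=2):
    """Return top users, top tags, peak hour, and average question length."""
    users = Counter()   # posts per author
    tags = Counter()    # tag occurrences on questions
    hours = Counter()   # posts per hour of day
    question_lengths = []

    for post in posts:
        users[post["author_id"]] += 1
        # Assumes timestamps start with "YYYY-MM-DD HH"; characters 11-12
        # are then the hour of day.
        hours[int(post["added_at"][11:13])] += 1
        if post["node_type"] == "question":
            tags.update(post["tagnames"].split())
            question_lengths.append(len(post["body"]))

    average = (sum(question_lengths) / len(question_lengths)
               if question_lengths else 0.0)
    return {
        "top_users": users.most_common(n),
        "top_tags": tags.most_common(n),
        "peak_hour": hours.most_common(1)[0][0] if hours else None,
        "avg_question_length": average,
    }


if __name__ == "__main__":
    for metric, value in analyze(SAMPLE_POSTS).items():
        print(metric, value)
```

Each metric reduces to a count or an average keyed by one field, which is why the same logic maps naturally onto mapper/reducer pairs or Spark key-value operations.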