Using the Taobao Double Eleven dataset to analyse and predict user behavior
-
Get a basic overview of the festival, such as the total transaction volume, the proportion of buyers by age and gender, and the trend compared to last year.
-
Analyse users' behaviors and identify the relationship between these behaviors and the final buy behavior. In other words, which kinds of behavior lead to a purchase.
-
Predict whether a buyer will buy from Taobao or not.
-
In the end, all of the outcomes above will be visualized.
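
As a rough illustration of the behavior analysis described above, here is a minimal Python sketch. It is not the project's actual code: the action codes, the `(user_id, action)` record shape and the function name `buy_rate_by_action` are all assumptions made for this example, and the real dataset's encoding may differ. For each non-buy action, it computes the share of users who performed that action and also bought something.

```python
from collections import defaultdict

# Hypothetical action codes, assumed for this sketch only.
CLICK, CART, BUY, FAVOURITE = 0, 1, 2, 3


def buy_rate_by_action(records):
    """Return {action: share of users who did that action and also bought}.

    records: iterable of (user_id, action) pairs (an assumed, simplified shape).
    """
    # Collect the set of distinct actions each user performed.
    actions_by_user = defaultdict(set)
    for user_id, action in records:
        actions_by_user[user_id].add(action)

    did = defaultdict(int)             # users who performed the action
    did_and_bought = defaultdict(int)  # ...and who also have a BUY action
    for actions in actions_by_user.values():
        for action in actions - {BUY}:
            did[action] += 1
            if BUY in actions:
                did_and_bought[action] += 1

    return {action: did_and_bought[action] / did[action] for action in did}


# Tiny made-up sample, only to show the call; not real Taobao data.
sample = [
    (1, CLICK), (1, CART), (1, BUY),
    (2, CLICK),
    (3, CLICK), (3, FAVOURITE), (3, BUY),
    (4, CLICK), (4, CART),
]
rates = buy_rate_by_action(sample)
```

On the full dataset the same grouping would more likely be written as a Hive query or a Spark job; the pure-Python version is only meant to make the idea concrete.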
- Get the dataset, preprocess it and load it into HDFS
- Use Hive to further process the dataset
- Use Spark to predict returning customers
- Visualization, planned with JavaWeb
- Linux
- Hadoop
- MySQL
- Sqoop
- Hive
- Spark
- Java 1.8
- Python3