A hadoop-on-mongo demo, migrated first to spark-on-hadoop-mongo and then to mongo-spark-connector.

spark-java-mongo-demo

Analyzes MongoDB data with Spark through mongo-hadoop. Written in Java.

Raw MongoDB data: { "_id" : ObjectId("54d83f3548c9bc218e056ce6"), "apMac" : "aa:bb:cc:dd:ee:ff", "proto" : "http", "url" : "extshort.weixin.qq.com", "clientMac" : "ff:ee:dd:cc:bb:aa" }

Output

mvn clean scala:compile compile package

How to run: spark-submit --class "sparkfisrttest.cdpspark.App" --packages org.mongodb.mongo-hadoop:mongo-hadoop-core:1.3.1,org.mongodb:mongodb-driver:3.0.1,org.mongodb:mongo-java-driver:3.0.1 ~/hadoop-spark-mongo-examples.jar

Early versions of the demo were based on the MongoDB driver for Hadoop:

mongo-hadoop-core

More recently, I found that there is an official MongoDB connector for Spark:

https://docs.mongodb.com/spark-connector/current/

so I added an RDD example based on mongo-spark-connector (I am not yet familiar with the Dataset and SQL APIs).
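
To make the record shape concrete, here is a minimal plain-JDK sketch of one aggregation that could be run over such documents: counting records per url. This is an illustration only, not the code in this repository. The class `UrlCountSketch`, the `Record` type, the `countByUrl` helper, and every sample value other than the document shown above are hypothetical. It uses Java streams rather than Spark so it runs without a cluster or a MongoDB instance; on a Spark RDD the same grouping would roughly correspond to `mapToPair` followed by `reduceByKey`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical, Spark-free stand-in for a per-url count over the sample documents.
class UrlCountSketch {

    // Mirrors the fields of the sample document (the _id field is omitted).
    static final class Record {
        final String apMac;
        final String proto;
        final String url;
        final String clientMac;

        Record(String apMac, String proto, String url, String clientMac) {
            this.apMac = apMac;
            this.proto = proto;
            this.url = url;
            this.clientMac = clientMac;
        }
    }

    // Groups records by url and counts them; a TreeMap keeps the keys sorted.
    static Map<String, Long> countByUrl(List<Record> records) {
        return records.stream()
                .collect(Collectors.groupingBy(r -> r.url, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // The first record is the sample from this README; the others are made up.
        List<Record> records = Arrays.asList(
                new Record("aa:bb:cc:dd:ee:ff", "http", "extshort.weixin.qq.com", "ff:ee:dd:cc:bb:aa"),
                new Record("aa:bb:cc:dd:ee:ff", "http", "extshort.weixin.qq.com", "11:22:33:44:55:66"),
                new Record("aa:bb:cc:dd:ee:ff", "https", "example.com", "ff:ee:dd:cc:bb:aa"));

        // Print one "url<TAB>count" line per distinct url.
        countByUrl(records).forEach((url, n) -> System.out.println(url + "\t" + n));
    }
}
```

The grouping key is easy to swap: replacing `r -> r.url` with `r -> r.clientMac` or `r -> r.proto` counts by client or by protocol instead.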