Set up MongoDB Atlas Cluster
- Cluster name - mflix
- Project name - analytics
- Admin user

Set up local development environment
- install mongo shell on OSX
brew install mongodb --with-openssl
- test connection
mongo "mongodb+srv://mflix-1hs5t.mongodb.net/test" --username analytics
- set up Python dev environment

Upload .csv file to cluster
- download movies_initial.csv
- upload to the cluster using the command line
mongoimport --type csv --headerline --db mflix --collection movies_initial --host "<CLUSTER>/<SEED_LIST>" --authenticationDatabase admin --ssl --username <username> --password <password> --file movies_initial.csv
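
As an alternative to mongoimport, the same upload could be sketched in Python. This is only a rough illustration, not the method used in these notes: `read_csv_documents`, `upload_csv`, and the `batch_size` parameter are hypothetical names, and `collection` is assumed to be a PyMongo collection object. Unlike mongoimport, `csv.DictReader` keeps every value as a string, so numeric columns would need converting separately.

```python
import csv


def read_csv_documents(path):
    """Read a CSV file with a header row into a list of dicts, one per row."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def upload_csv(collection, path, batch_size=1000):
    """Insert the rows of a CSV file into a collection in batches.

    `collection` is assumed to expose insert_many(), as a PyMongo
    collection does. Returns the number of rows sent.
    """
    docs = read_csv_documents(path)
    inserted = 0
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        collection.insert_many(batch)
        inserted += len(batch)
    return inserted
```

With a connected client, a call might look like `upload_csv(client.mflix.movies_initial, "movies_initial.csv")`.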

Set up MongoDB Compass
- Download here
- Connect to the cluster

Test connection via PyMongo
from pymongo import MongoClient

# Replace <PASSWORD> with the password of the analytics user.
client = MongoClient("mongodb+srv://analytics:<PASSWORD>@mflix-1hs5t.mongodb.net/test?retryWrites=true")
# MongoClient connects lazily, so send a ping to confirm the cluster is reachable.
client.admin.command("ping")
db = client.mflix
print(db)

The aggregation framework is a set of analytics tools within MongoDB that let you run various types of reports or analyses on documents in one or more MongoDB collections. It is based on the concept of a pipeline.
The idea with an aggregation pipeline is that we take input from a MongoDB collection and pass the documents from that collection through one or more stages, each of which performs a different operation on its inputs. The output of one stage becomes the input of the next.
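
To make the pipeline idea concrete, here is a minimal sketch of a three-stage pipeline in PyMongo. The field names `year` and `language` are assumptions about the imported CSV headers (and `year` is assumed to have been imported as a number), and `count_movies_by_language` is a hypothetical helper name. Each dict in the list is one stage: `$match` filters documents, `$group` counts them per language, and `$sort` orders the counts.

```python
# Each dict is one stage; documents flow through the stages in order.
# The field names "year" and "language" are assumed, not confirmed.
pipeline = [
    {"$match": {"year": {"$gte": 2000}}},                     # keep movies from 2000 onward
    {"$group": {"_id": "$language", "count": {"$sum": 1}}},   # one result per language
    {"$sort": {"count": -1}},                                 # most common language first
]


def count_movies_by_language(collection):
    """Run the pipeline on a PyMongo collection and return the results as a list."""
    return list(collection.aggregate(pipeline))
```

With the `db` object from the connection test, this could be called as `count_movies_by_language(db.movies_initial)`.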
