/mongodb-analytics

mongodb-analytics tools

Primary language: Python. License: MIT.

MongoDB Atlas

Setup Environment

  1. Set up MongoDB Atlas Cluster

    • Cluster name - mflix
    • Project name - analytics
    • Admin user
  2. Set up local development environment

    • install mongo shell on OSX
    brew install mongodb --with-openssl
    • test connection
    mongo "mongodb+srv://mflix-1hs5t.mongodb.net/test" --username analytics
    • set up Python dev environment
  3. Upload .csv file to cluster

    mongoimport --type csv --headerline --db mflix --collection movies_initial --host "<CLUSTER>/<SEED_LIST>" --authenticationDatabase admin --ssl --username <username> --password <password> --file movies_initial.csv
  4. Set up MongoDB Compass

  5. Test connection via PyMongo

from pymongo import MongoClient

client = MongoClient("mongodb+srv://analytics:<PASSWORD>@mflix-1hs5t.mongodb.net/test?retryWrites=true")
db = client.mflix
print(db)
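
MongoClient connects lazily, so printing db does not by itself prove the cluster is reachable. As a minimal sketch, the helper below forces a round trip with the ping command. The name check_connection is hypothetical and not part of this repo; it expects an already constructed client such as the one above.

```python
def check_connection(client):
    """Return True if the server answers a ping command.

    `client` is assumed to be a pymongo.MongoClient (or anything exposing
    the same `admin.command` interface).
    """
    # Running any command forces the driver to contact the server;
    # "ping" is a cheap one that returns a document containing "ok".
    reply = client.admin.command("ping")
    return reply.get("ok") == 1.0
```

With the client created above, print(check_connection(client)) should print True once the credentials and network access are correct; connection or authentication problems surface as PyMongo exceptions instead.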

Aggregation Framework

The aggregation framework is a set of analytics tools within MongoDB that lets you run reports or analyses on the documents in one or more MongoDB collections. It is based on the concept of a pipeline.

Pipeline

The idea behind an aggregation pipeline is that we take input from a MongoDB collection and pass its documents through one or more stages, each of which performs a different operation on its input. The output of one stage becomes the input of the next.
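
To make the stage idea concrete, here is a rough sketch of a four-stage pipeline in PyMongo form. The field names language and year are assumptions about the columns in movies_initial.csv, and PIPELINE and top_years are hypothetical names used only for illustration.

```python
# Each dict is one stage; documents flow through them in list order.
PIPELINE = [
    # Stage 1: keep only documents whose (assumed) language field is "English".
    {"$match": {"language": "English"}},
    # Stage 2: group the remaining documents by (assumed) year and count them.
    {"$group": {"_id": "$year", "count": {"$sum": 1}}},
    # Stage 3: order the groups by count, largest first.
    {"$sort": {"count": -1}},
    # Stage 4: keep only the first five groups.
    {"$limit": 5},
]


def top_years(collection):
    """Run PIPELINE against a PyMongo collection and return the results as a list."""
    return list(collection.aggregate(PIPELINE))
```

Assuming the db handle from the connection test above, this could be called as top_years(db.movies_initial). Adjust the field names to match whatever the imported documents actually contain.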