MongoDB for Data Science Projects

This repository contains two projects that use MongoDB for data science: an analysis of NYC taxi trip data and a car price prediction model. Each shows how MongoDB fits into a data analysis or machine learning workflow.

Projects Overview

1. NYC Taxi Trips Data Analysis

Objective:
Perform an in-depth analysis of the NYC Taxi Trips dataset to uncover insights and trends related to taxi operations, passenger behavior, and geographical patterns.

Key Features:

  • Data Exploration: Loading and exploring large-scale datasets with MongoDB.
  • Data Aggregation: Utilizing MongoDB's aggregation framework to perform complex queries and data summarization.
  • Visualization: Creating meaningful visualizations to represent findings, such as trip distributions, revenue patterns, and peak hours.
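As an illustration of the aggregation step, here is a minimal sketch of a pipeline that counts trips and sums revenue per pickup hour. The field names `pickup_datetime` and `total_amount` are assumptions and may not match the dataset used in the notebooks; the timestamp is assumed to be stored as a BSON date. Only the `$group`, `$sum`, `$hour`, and `$sort` operators and pymongo's `Collection.aggregate` are used.

```python
from typing import Any


def hourly_revenue_pipeline(
    pickup_field: str = "pickup_datetime",  # assumed field name (BSON date)
    amount_field: str = "total_amount",     # assumed field name (numeric)
) -> list[dict[str, Any]]:
    """Build a pipeline that counts trips and sums revenue per pickup hour."""
    return [
        # Group documents by the hour (0-23, UTC by default) of the pickup time.
        {
            "$group": {
                "_id": {"$hour": f"${pickup_field}"},
                "trips": {"$sum": 1},
                "revenue": {"$sum": f"${amount_field}"},
            }
        },
        # Order the result from hour 0 to hour 23.
        {"$sort": {"_id": 1}},
    ]


def hourly_revenue(collection) -> list[dict[str, Any]]:
    """Run the pipeline on a pymongo collection and return the grouped rows."""
    return list(collection.aggregate(hourly_revenue_pipeline()))
```

With pymongo installed, `hourly_revenue(MongoClient()["nyc_taxi"]["trips"])` would run it against a local server; the database and collection names here are hypothetical. The resulting rows can be passed to a plotting library to show peak hours.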

Outcome:
The analysis identifies peak times and revenue patterns in NYC taxi operations and shows how geographical factors affect taxi services.

2. Car Price Prediction Using Machine Learning

Objective:
Build a machine learning model to predict car prices from features such as make, model, year, and mileage, using MongoDB as the backend database.
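To make the data flow concrete, the sketch below turns car documents into feature rows and target prices. It assumes each document stores the features named above under the keys `make`, `model`, `year`, and `mileage`, and the target under `price`; these key names and the sample documents are illustrative and are not taken from the project's dataset.

```python
from typing import Any, Iterable

FEATURES = ("make", "model", "year", "mileage")  # assumed document keys
TARGET = "price"  # assumed name of the target field


def to_rows(docs: Iterable[dict[str, Any]]) -> tuple[list[list[Any]], list[float]]:
    """Split car documents into feature rows and target prices.

    Documents missing any feature or the target are skipped.
    """
    X, y = [], []
    for doc in docs:
        if all(key in doc for key in (*FEATURES, TARGET)):
            X.append([doc[key] for key in FEATURES])
            y.append(float(doc[TARGET]))
    return X, y


# Made-up sample documents, shaped like what `collection.find()` would yield.
sample_docs = [
    {"_id": 1, "make": "Toyota", "model": "Corolla", "year": 2018,
     "mileage": 42000, "price": 15500},
    {"_id": 2, "make": "Honda", "model": "Civic", "year": 2016,
     "mileage": 61000},  # no price, so this document is skipped
]

X, y = to_rows(sample_docs)
print(X)  # [['Toyota', 'Corolla', 2018, 42000]]
print(y)  # [15500.0]
```

Categorical values such as make and model would still need to be encoded numerically before they are passed to a regression model.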

Key Features:

  • Data Preprocessing: Efficiently storing, retrieving, and preprocessing car-related data using MongoDB.
  • Model Building: Implementing regression models to predict car prices.
  • Model Evaluation: Evaluating model performance using metrics like Mean Absolute Error (MAE) and R-squared.
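For reference, the two evaluation metrics named above can be computed as follows. This is a plain-Python sketch with made-up prices, not the project's evaluation code; in practice `sklearn.metrics.mean_absolute_error` and `sklearn.metrics.r2_score` compute the same quantities.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined (division by zero) when all true values are identical.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up actual and predicted prices, for illustration only.
actual = [10000, 15000, 20000]
predicted = [11000, 14000, 20500]

print(round(mean_absolute_error(actual, predicted), 2))  # 833.33
print(round(r_squared(actual, predicted), 3))            # 0.955
```

MAE is expressed in the same unit as the price, which makes it easy to interpret; R-squared reports the share of price variance the model explains.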

Outcome:
The resulting model estimates car prices with reasonable accuracy, as measured by MAE and R-squared, and shows that MongoDB can serve as the data layer of a machine learning pipeline.

Getting Started

Prerequisites

  • Python 3.x
  • A MongoDB instance, either installed locally or hosted in the cloud
  • Required Python packages listed in requirements.txt

Installation

  1. Clone the repository:

    git clone https://github.com/MisbahullahSheriff/mongodb-with-data-science.git

  2. Install the required dependencies:

    pip install -r requirements.txt

  3. Ensure MongoDB is running and accessible.
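One way to check that the server is reachable from Python is to send it the `ping` admin command through pymongo. The sketch below assumes pymongo is installed and that MongoDB listens on the default local address `mongodb://localhost:27017`; substitute your own connection string for a cloud-hosted instance. The helper names are illustrative.

```python
def can_ping(client) -> bool:
    """Return True if the server behind `client` answers the `ping` command."""
    try:
        client.admin.command("ping")
        return True
    except Exception:
        # With pymongo this is typically ServerSelectionTimeoutError or
        # another ConnectionFailure when the server cannot be reached.
        return False


def check_local_server(uri: str = "mongodb://localhost:27017") -> bool:
    """Ping a MongoDB server, giving up after two seconds."""
    from pymongo import MongoClient  # imported here so can_ping has no dependency

    client = MongoClient(uri, serverSelectionTimeoutMS=2000)
    return can_ping(client)
```

Calling `check_local_server()` returns `True` when the server responds and `False` otherwise. The short `serverSelectionTimeoutMS` keeps the check from waiting for pymongo's default 30-second server selection timeout.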

Usage

  • Each project directory contains detailed instructions for running its code.
  • Use the Jupyter notebooks provided for step-by-step execution.

Connect