/Data-Scientist-Udacity-Nanodegree-Term2

This repo contains the material and projects for Term 2 of the Udacity Data Scientist Nanodegree.

Primary language: Jupyter Notebook

Data-Scientist-Udacity-Nanodegree-

Introduction

  • Projects and material for Term 2 of the Udacity Data Scientist Nanodegree.

  • This repo contains the exercises, projects, and extracurricular material.

  • Link to the certificate: https://confirm.udacity.com/SPCUMMK6

Table of Contents

  1. Extracurricular:

0.1: Convolutional Neural Network

0.2: Spark

Term 1

Link for term 1 repo

Term 2

Lessons

  1. Introduction to Data Science: Introduce the data science process and how to communicate results to stakeholders.

  2. Software Engineering: Best practices in software engineering plus web development. Tools: Python, Flask, Heroku, unit tests.

  3. Data Engineering: Building ETL, NLP, and machine learning pipelines.

  4. Experimental Design & Recommendations: Design experiments and analyze A/B test results. Explore approaches for building recommendation systems.
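As a rough illustration of what analyzing an A/B test result can involve, here is a minimal sketch of a two-proportion z-test using only the Python standard library. The function name and the conversion counts are made up for illustration and are not taken from the course material.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: both groups share one conversion rate."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally often
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: control (A) vs. experiment (B)
z, p = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=250, n_b=2000)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A small p-value suggests the difference in conversion rates is unlikely under the null hypothesis; whether the difference matters in practice is a separate question.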

Projects

  1. Write a Blog Post: Analysis of Amsterdam Airbnb listings following the CRISP-DM process. Tools: Python, scikit-learn, NumPy, pandas, Matplotlib.

  2. Disaster Response Pipelines: Analyzed disaster data from Figure Eight to build an NLP model for an API that classifies disaster messages. The first part of the project was an ETL pipeline that stores the data in a SQL database. The NLP model then categorizes each message so that it can be routed to a relevant relief agency. Finally, Flask was used to create the API. Tools: Python, NLP, scikit-learn, NumPy, Matplotlib, Flask, ETL.

  3. Recommendations with IBM: Built a recommendation engine for the IBM Watson platform based on user behavior and social network data, to surface the content most likely to be relevant to a user. The project involved building several types of recommendation engines: rank-based, user-user collaborative filtering, and matrix factorization.

  4. Building a Promotional Strategy for Starbucks Customers: Used uplift modeling techniques to identify which groups of customers are most responsive to each type of offer. The data mimic Starbucks customers' behavior on the rewards mobile app. This required data preprocessing, RFM clustering (k-means), and XGBoost to build the models. Tools: Python, scikit-learn, data visualization, NumPy, pandas, RFM clustering, uplift modeling, XGBoost.
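To make the idea of uplift in the Starbucks project more concrete, the sketch below estimates uplift per customer segment as the difference between the conversion rate of customers who received an offer and those who did not. This is a deliberately simple two-group estimate, not the XGBoost-based models used in the project; the function name, segment labels, and toy records are hypothetical.

```python
from collections import defaultdict


def uplift_by_segment(records):
    """records: iterable of (segment, treated, converted) tuples.

    Returns {segment: treated conversion rate - control conversion rate}.
    """
    counts = defaultdict(lambda: {"t_n": 0, "t_conv": 0, "c_n": 0, "c_conv": 0})
    for segment, treated, converted in records:
        c = counts[segment]
        prefix = "t" if treated else "c"
        c[prefix + "_n"] += 1
        c[prefix + "_conv"] += int(converted)

    uplift = {}
    for segment, c in counts.items():
        if c["t_n"] == 0 or c["c_n"] == 0:
            continue  # uplift cannot be estimated without both groups
        uplift[segment] = c["t_conv"] / c["t_n"] - c["c_conv"] / c["c_n"]
    return uplift


# Toy data (assumed): 100 treated and 100 control customers per segment
toy = (
    [("high_value", True, True)] * 30 + [("high_value", True, False)] * 70
    + [("high_value", False, True)] * 10 + [("high_value", False, False)] * 90
    + [("low_value", True, True)] * 12 + [("low_value", True, False)] * 88
    + [("low_value", False, True)] * 10 + [("low_value", False, False)] * 90
)
result = uplift_by_segment(toy)
for segment, value in sorted(result.items(), key=lambda kv: -kv[1]):
    print(f"{segment}: uplift = {value:+.2f}")
```

In a setup like this, segments could come from a clustering step such as RFM with k-means, and the segments with the largest estimated uplift would be the most promising targets for an offer.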

Licence

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. Please refer to Udacity Terms of Service for further information.