- Data Visualization and Analytics of Kickstarter Projects in 2018
- CS 544 Fall 2018 - Foundations of Analytics with R | Boston University Metropolitan College
Download the data set from Kaggle here
Download Presentation file here
Download Video Presentation file here
Download R script here
Look into the following sites as examples and select a data set that interests you.
- https://www.kaggle.com/datasets
- http://www.kdnuggets.com/datasets/index.html
- Any other source of your choice
- Import the data set into R.
- Document the steps for the import process and any preprocessing that had to be done prior to or after the import. Include any R code used in the process.

Analyzing the data
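
Before any analysis, the data has to be loaded and cleaned. The following is a minimal sketch of the import and preprocessing steps listed above, not the script used in this project. The file contents and the column names (`main_category`, `state`, `goal`, `pledged`, `launched`) are assumptions made for illustration; a small temporary CSV stands in for the Kaggle download so that the example runs on its own.

```r
# Hypothetical stand-in for the downloaded Kaggle CSV. The real file name
# and column names are assumptions and must be adapted to the actual data.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "name,main_category,state,goal,pledged,launched",
  "Project A,Music,successful,1000,1500,2017-03-01 10:00:00",
  "Project B,Games,failed,5000,120,2017-05-12 08:30:00",
  "Project C,Music,failed,,40,2017-06-20 14:15:00",
  "Project D,Film,successful,2000,2600,2017-08-02 19:45:00"
), csv_path)

# Import: keep strings as character so every conversion below is explicit.
projects <- read.csv(csv_path, stringsAsFactors = FALSE)

# Preprocessing after the import:
# 1. Treat the categorical columns as factors.
projects$main_category <- factor(projects$main_category)
projects$state <- factor(projects$state)

# 2. Parse the launch timestamp into a Date (the time of day is dropped).
projects$launched <- as.Date(projects$launched, format = "%Y-%m-%d")

# 3. Drop rows with missing values (here: the project without a goal).
projects <- projects[complete.cases(projects), ]

# Inspect the result to document what was imported.
str(projects)
```

With the real data set, `csv_path` would point to the downloaded file instead, and the preprocessing steps would depend on which columns actually need cleaning.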
- Do the analysis as in Module 3 for at least one categorical variable and at least one numerical variable. Show appropriate plots for your data.
- Do the analysis as in Module 3 for at least one set of two or more variables. Show appropriate plots for your data.
- Pick one variable with numerical data and examine the distribution of the data.
- Draw various random samples of the data and show the applicability of the Central Limit Theorem for this variable.
- Show how various sampling methods can be used on your data. What are your conclusions if these samples are used instead of the whole data set?
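
The Central Limit Theorem and sampling items above can be sketched in base R as follows. This is an illustration under stated assumptions, not the project's actual analysis: the data is synthetic (an exponentially distributed `pledged` column and a made-up `category` column stand in for the real variables), and the sample sizes are arbitrary choices.

```r
set.seed(544)

# Synthetic, right-skewed stand-in for a numerical variable of the data set.
# Column names and distribution are assumptions for illustration only.
N <- 10000
population <- data.frame(
  id = seq_len(N),
  category = sample(c("Film", "Games", "Music"), N,
                    replace = TRUE, prob = c(0.5, 0.3, 0.2)),
  pledged = rexp(N, rate = 1 / 1000)
)

# Central Limit Theorem: draw many samples of a given size and keep the means.
sample_means <- function(x, size, reps = 1000) {
  replicate(reps, mean(sample(x, size, replace = TRUE)))
}
sizes <- c(10, 40, 160)
xbar <- lapply(sizes, function(n) sample_means(population$pledged, n))
names(xbar) <- sizes

# The spread of the sample means should shrink roughly like sd / sqrt(n).
clt_summary <- data.frame(
  n = sizes,
  mean_of_means = sapply(xbar, mean),
  sd_of_means = sapply(xbar, sd),
  theoretical_se = sd(population$pledged) / sqrt(sizes)
)
print(clt_summary)

# Sampling methods, each drawing about n rows from the population.
n <- 300

# Simple random sampling without replacement.
srs <- population[sample(N, n), ]

# Systematic sampling: every k-th row after a random start.
k <- floor(N / n)
start <- sample(k, 1)
systematic <- population[seq(start, by = k, length.out = n), ]

# Stratified sampling with sizes proportional to each category's share.
strata <- split(population, population$category)
stratified <- do.call(rbind, lapply(strata, function(s) {
  s[sample(nrow(s), round(n * nrow(s) / N)), ]
}))

# Compare the sample estimates of the mean with the full-data value.
estimates <- c(
  population = mean(population$pledged),
  simple_random = mean(srs$pledged),
  systematic = mean(systematic$pledged),
  stratified = mean(stratified$pledged)
)
print(round(estimates, 1))
```

Calling `hist(xbar[["40"]])` for each sample size shows the histograms of the sample means becoming narrower and more symmetric as the sample size grows, even though the underlying variable is skewed. Comparing `estimates` indicates how closely each sampling method reproduces the full-data mean.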
- You need to record your project presentation and submit it as well.
- Each presentation is for at most 10 minutes.
- The project will be due on Sunday, December 16th, 11:59 PM EST.