jtleek/modules

maacs dataset

alfakini opened this issue · 24 comments

Hi, where can I get the maacs dataset? I see you use it in 04_ExploratoryAnalysis/ggplot2/ggplot2_p2.Rmd, but the file isn't in the repository. I am coming from Computing for Data Analysis on Coursera, where a lot of people are trying to find the maacs dataset to try the examples presented in the lectures: https://class.coursera.org/compdata-004/class/search?q=maacs#11-state-query=maacs&11-state-filter=all&11-state-page_num=2

Thanks,
alf.

Just seconding here that a link or reference to the data would be nice.

Okay, here goes:

The data is in this RDS file: https://github.com/jtleek/modules/blob/master/04_ExploratoryAnalysis/PlottingLattice/maacs_env.rds

This file opens the data and draws a big set of panels showing the level of allergen in the air for each subject over 5 visits (lines 201-204): https://github.com/jtleek/modules/blob/master/04_ExploratoryAnalysis/PlottingLattice/index.Rmd

The MAACS dataset has no "eno" field as shown in the ggplot2 slides (Roger Peng's Exploratory Data Analysis course). Any idea why?

You may load the whole data set with eno from here:

https://github.com/TarekDib03/ExploratoryDataAnalysisCoursera/blob/master/maacs.Rda

Let me know if you have any questions!

Thanks TarekDib03. It's working now.

Thanks TarekDib03!

Thanks TarekDib03.

Anytime guys! I am glad I was able to help.


I appreciate your upload. This will help me practice with the ggplot2 videos.

In the dataset I'm not finding logpm25, NocturnalSympt, and a few other attributes.
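
A quick way to see which of the lecture's variables a given copy of the data lacks is to compare the expected names against `names()` of the loaded data frame. The sketch below uses a small stand-in data frame (hypothetical; the real one would come from loading the downloaded file), and the `expected` vector lists only variable names mentioned in this thread.

```r
# Stand-in for the loaded data; hypothetical columns, for illustration only.
maacs <- data.frame(id = 1:3, eno = c(10, 20, 30))

# Variable names mentioned in this thread as used in the lectures.
expected <- c("eno", "logpm25", "bmicat", "NocturnalSympt")

# Names in `expected` that are absent from the data frame.
missing_vars <- setdiff(expected, names(maacs))
print(missing_vars)
```

An empty result would mean every listed variable is present in that copy of the data.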

I also want to point out that the data used in the lecture have no corresponding dataset we can use, and the dataset provided above does not contain some of the columns that were shown in the lecture.

jtleek used the maacs dataset for the ggplot2 example, but the link above does not have the attributes used in it, such as bmicat, NocturnalSympt, and logpm25. Does anyone have any idea where to find it?

Do we have any updates here?

There are some columns missing in the maacs data set, more specifically those needed for 3 - 7 - ggplot2(part 5)[8_11].mp4 video.

I found the data set here: https://github.com/TarekDib03/ExploratoryDataAnalysisCoursera/blob/master/maacs.html
But I cannot load it; I get the error message: "more columns than column names"
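
That message comes from `read.table()` (and wrappers such as `read.csv()`) when data rows have more fields than the header line. One possible cause here, which is an assumption since the comment does not say how the file was read, is that the `.html` link is a rendered page rather than a delimited text file. The sketch below reproduces the error with a deliberately malformed inline table; the input text is made up.

```r
# Header has 2 fields but each data row has 4, so read.table() cannot
# match columns to names. The input is invented purely to trigger the error.
bad_text <- "a b\n1 2 3 4\n5 6 7 8"

msg <- tryCatch(
  {
    read.table(text = bad_text, header = TRUE)
    NA_character_  # reached only if no error was raised
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If this is what happened, loading one of the `.Rda` files with `load()` instead of parsing the page as text would avoid the problem.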

Try this code to save all columns from the original data set (https://github.com/jtleek/modules/blob/master/04_ExploratoryAnalysis/PlottingLattice/maacs_env.rds):
# Read the environmental data from the RDS file in the working directory.
env <- readRDS("maacs_env.rds")
# One id per row (750 rows in this file).
id <- seq_len(nrow(env))
maacs <- data.frame(id, env)
save(maacs, file = "maacs.rda")
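
The snippet above relies on a difference between the two file formats: `readRDS()` returns a single unnamed object that you assign yourself, whereas `load()` restores objects under the names they were saved with. A minimal round-trip sketch with a toy data frame and temporary files (both invented for illustration, not the real MAACS data):

```r
# Toy stand-in for the contents of maacs_env.rds (hypothetical values).
env <- data.frame(visit = c(1, 2, 3), allergen = c(0.5, 1.2, 0.8))

rds_path <- tempfile(fileext = ".rds")
rda_path <- tempfile(fileext = ".rda")

# RDS: one object, no name stored; the reader chooses the variable name.
saveRDS(env, rds_path)
env_copy <- readRDS(rds_path)

# Add an id column sized to the data rather than a hard-coded length.
maacs <- data.frame(id = seq_len(nrow(env_copy)), env_copy)

# RDA: objects are stored with their names; load() returns those names.
save(maacs, file = rda_path)
rm(maacs)
restored <- load(rda_path)
print(restored)      # name(s) of the restored object(s)
print(names(maacs))  # columns of the restored data frame

unlink(c(rds_path, rda_path))
```

This is why the `.rds` file needs an assignment (`env <- readRDS(...)`) while the `.Rda` files in this thread are used with a bare `load(...)`.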

jtleek used the maacs dataset for the ggplot2 example, but the link above does not have all the attributes used in it, for example bmicat, NocturnalSympt, and logpm25. Has anyone managed to get it?

As best I can tell, you can't get the dataset that contains bmicat and NocturnalSympt for this part of the lecture. I found the quote below in http://lib.psylab.info/files/Peng2015b.pdf

"NOTE: Because the individual-level data for this study are protected by various U.S.
privacy laws, we cannot make those data available. For the purposes of this chapter, we
have simulated data that share many of the same features of the original data, but do not
contain any of the actual measurements or values contained in the original dataset."

If someone finds this dataset, I'd love to recreate the plots in the second part of the lecture.

Use THIS: https://github.com/lupok2001/datasciencecoursera/blob/master/maacs.Rda

The variables are simulated. When run with the R Code from the ggplot2 lectures it gives slightly different graphs, but it works.

thanks lupok2001! it worked for me

Thanks TarekDib03, your contribution was very useful.

Thank you guys, it is a great help to find the data here. Otherwise I could not practice. Cheers

Thank you @lupok2001 , it works well for me!

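Thanks @lupok2001. Works for me:
# Make sure the target directory exists before downloading.
dir.create("./lecture", showWarnings = FALSE)
download.file("https://github.com/lupok2001/datasciencecoursera/raw/master/maacs.Rda",
              destfile = "./lecture/maacs.Rda", mode = "wb")
load("./lecture/maacs.Rda")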

I loaded it with these steps:
url <- "https://raw.githubusercontent.com/lejarx/MAACS-dataset/master/maacs.rda"
destfile <- tempfile(fileext = ".rda")
download.file(url, destfile, method = "libcurl", mode = "wb", quiet = TRUE)
load(destfile)
unlink(destfile)
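
Because `load()` silently creates (or overwrites) objects in the workspace, it can help to load a downloaded `.rda` into its own environment and inspect what it contains first. The helper below, `load_rda()`, is a hypothetical name, and the demo uses a temporary file with made-up data instead of a real download.

```r
# Hypothetical helper: load an .rda file into a fresh environment and
# return its objects as a named list, leaving the workspace untouched.
load_rda <- function(path) {
  stopifnot(file.exists(path))
  e <- new.env()
  load(path, envir = e)
  mget(ls(e), envir = e)
}

# Demo with a temporary file standing in for a downloaded maacs.rda.
demo_path <- tempfile(fileext = ".rda")
maacs_demo <- data.frame(id = 1:2, eno = c(12.5, 30.1))  # made-up values
save(maacs_demo, file = demo_path)

objs <- load_rda(demo_path)
print(names(objs))    # which objects the file contains
str(objs$maacs_demo)  # structure of the data frame inside

unlink(demo_path)
```

For a real file, the same helper would be called on the `destfile` passed to `download.file()` before that file is deleted with `unlink()`.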