It is assumed that you have read Sections 5.5 - 5.6 from R4DS and completed the Derive Information with dplyr Primer.
In this activity, you will:
- Produce numerical summaries of variables using {dplyr}.
- Produce numerical summaries of variables by a grouping variable using {dplyr}.
- Compute new variables in a dataset using {dplyr}.
Remember that more detailed directions can be found in Task 1 of Activity 4.
Fork this repo and clone it to a new RStudio Project.
Planned Pause Point: If you have any questions, contact your instructor or another group. We will complete this activity during our next class session.
The activity05-data-summarization.Rmd file contains the directions for this activity. For the rest of this class period, you will complete the RMarkdown document with your neighbor(s). Your instructor will be circulating and available to help when needed.
Note that each person is working in their own repo. We are not worrying about collaboration for the time being; instead, we are focusing on getting more comfortable with the workflow between RStudio and GitHub.
However, do not continue in this README document until you and your neighbor(s) have completed your .Rmd files.
We now have a number of skills to help us explore datasets. Before we add too many more tools and skills, we should take stock of what we have learned so far. Look at the Course Objectives that we came up with at the beginning of this semester. Take 5 minutes to identify which of these you feel comfortable with. How could you demonstrate what you have learned?
Now, think back through the 5 Activities that we have completed. What is still not clear? What will you do to better understand these tricky items?
Next: Activity 6 will focus on restructuring data to be easier for humans to read or easier for computers to handle.