ropensci/unconf18

Connecting R educators "in the wild"

Opened this issue · 14 comments

I know it is late to propose a new project, so consider this an invite to talk with me more about this idea at unconf if it resonates with you!

Problem:
People who teach R create awesome R markdown, blogdown, and bookdown materials for teaching, most of which are stored on GitHub. But, they can be hard to find (everyone knows @STAT545-UBC, but discoverability of even these materials is low for people not fully steeped in #rstats). The tidyverse site has some links to courses, but the materials are variable: some are PDF syllabi, some are full repos, some are formal university course listings.

Idea:
Inspired by @batpigandme's idea (#48), I've been thinking of a website to aggregate existing educational materials from GitHub. Ideally, one could search GitHub for repos that include words in the title/tag/README like "curriculum", "course", "workshop", "bootcamp", and tag them as such (I want to catch repos like @hadley's Data Challenge Labs: https://github.com/dcl-2017-04/curriculum). Other items on my "would be nice" list:

  • Tag with blogdown, bookdown, or R markdown site
  • Tag with type of license, if there is one, re: reuse/attribution/etc.
  • Provide a "tidyverse" percentage: something like, of the packages loaded in the repo, what percent are in the tidyverse ecosystem?
  • Provide way to see "last updated" easily, and perhaps in navigable interface allow users to sort by this
  • Some kind of topic tagging: like statistics, machine learning, data science, data visualization, natural language processing, etc.
  • Perhaps a level tag, like undergrad, grad, K-12, etc.
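As a rough illustration of the "tidyverse percentage" item above, here is a minimal Python sketch, assuming the R source of a repo has already been read into a string. The `TIDYVERSE` set is a hand-picked, hypothetical subset rather than an official package list, and the function names are made up for this example. It only catches plain `library()` and `require()` calls, so `pkg::fun` usage would be missed.

```python
import re

# Assumption: a hand-picked subset of tidyverse packages, not an official list.
TIDYVERSE = {
    "tidyverse", "ggplot2", "dplyr", "tidyr", "readr",
    "purrr", "tibble", "stringr", "forcats",
}

# Matches library(pkg), library("pkg"), require(pkg), require('pkg').
LOAD_CALL = re.compile(
    r"""\b(?:library|require)\(\s*["']?([A-Za-z][A-Za-z0-9.]*)["']?"""
)


def loaded_packages(r_source):
    """Return the set of package names loaded via library()/require()."""
    return set(LOAD_CALL.findall(r_source))


def tidyverse_percentage(r_source):
    """Percent of loaded packages that are in TIDYVERSE, or None if none are loaded."""
    pkgs = loaded_packages(r_source)
    if not pkgs:
        return None
    return 100.0 * len(pkgs & TIDYVERSE) / len(pkgs)


# Hypothetical R script contents, for illustration only.
example = 'library(dplyr)\nlibrary("ggplot2")\nrequire(MASS)\nlibrary(data.table)\n'
print(tidyverse_percentage(example))
```

In the example, two of the four loaded packages are in the set, so the score is one half expressed as a percentage. A real version would need to walk every `.R` and `.Rmd` file in a repo and decide how to treat packages attached indirectly.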

Selfishly, I would find this type of resource very useful! But past-me would have found it invaluable. I frequently see professors in my own computer science group using Matlab for example because they don't know how to start teaching material they know using a language they don't know. It would be great to be able to forward them to courses on machine learning using R, for example. Just overheard yesterday a student lamenting that all course materials for ML are in Matlab, the TA only knows python, and she wants to use R, so I think this could also help students.

More broadly, I would love to establish an educator's collaborative around teaching R or with R. My university created one, and they worded it so nicely I'm just going to plagiarize:

"The Educators' Collaborative (EC) is a community of practice for people who are interested in education, including direct teaching, innovation, scholarship, curriculum design and mentoring. A community of practice is a group of skilled practitioners who interact regularly to learn from and with one another for the purpose of professional and personal development. Through in-person or online engagement, they create a shared understanding of purpose and develop communal resources to enhance their respective practices. (Lave & Wenger, 1991; Wenger, 1998)."

I have increasingly been working on team-taught courses and see real value in collaborating on curricula with other R educators. But not everyone has this luxury- it would be great to provide an organization to support innovative R education efforts.

Tagging folks that @stefaniebutland tagged on the Slack channel for interest/involvement in education:
@jennybc @laderast @hadley @jtr13 @czeildi @elinw @seankross @aurielfournier (I can't find Jenny Draper on GitHub, so I'm sorry for not tagging here!)

I think this is a brilliant idea.

I do think the whole issue of cataloging resources is really hard though. As you point out, there's such a mix of content, even on a very short and curated collection:

"The tidyverse site has some links to courses, but the materials are variable: some are PDF syllabi, some are full repos, some are formal university course listings."

Personally, I've found examples for whole-semester material, like @jennybc 's 545 and @hadley's data challenge labs, immensely influential to my own approach, because it is nice to see how the parts fit together, in what order, and guided by what philosophy (@hadley's analogies with learning baseball still stick with me). But then I'm also trying to teach a semester-length course, so that's my bias as well (my blogdown-based site is https://espm-157.carlboettiger.info/ )

I know @lwasser and @coatless have both thought a lot about the issue of cataloging resources in this space, so tagging them here.

Neato. Another project I am on (CTSA data to health, CD2H) is trying to do this in terms of data science competencies and finding paths through different courses, as well as assessing any possible gaps. As @cboettig mentioned, it's a very difficult process, and the question is how to make it dynamic, since competencies change. I'd love to talk to everyone.

(edit: I'd like to talk with everyone - but I removed the description of the project. don't want to dominate this conversation)

@cboettig Thank you, yes- my main motivation is to collate long-form materials, of which your ESPM course is currently living in my unwieldy bookmarks folder of inspiring courses! And I agree- seeing others' whole semester material has shaped so much of how and what I teach. I think having community discussion forums like R4DS and https://community.rstudio.com/c/teaching are also invaluable, but there is a gap to fill. The Carpentries have great content resources, but they tend to be short-form. I do think what is lacking is field-tested quarter/semester-long materials with integrative syllabi, labs, homeworks, grading rubrics, datasets, good in class activities, project ideas, etc. (@jennybc's @STAT545-UBC being probably the main exception). As you know, a lot of blood/sweat/tears goes into designing curricula, especially the flow and sequencing for whole courses, with important differences compared to short-form tutorials or code-throughs (which I find really helpful as a learner and teacher, and @batpigandme's #48 may improve discoverability of those materials).

So you got me thinking more about the elements of a community of practice:

  • The domain:
    • R
  • The community:
    • Definition: In pursuing their interest in their domain, members engage in joint activities and discussions, help each other, and share information. They build relationships that enable them to learn from each other; they care about their standing with each other.
    • Current communities:
  • The practice:
    • Definition: Members of a community of practice are practitioners. They develop a shared repertoire of resources: experiences, stories, tools, ways of addressing recurring problems; in short, a shared practice.
    • Current education + R-based practices:
      • Carpentries instructors perhaps? But this is for workshop formats; need one for "whole course" practitioners.
      • Some university departments may have this, but you need a quorum. There are lots of faculty who are little R islands in their department/university (this is me!).

So, I think we have the first two, but the third element of practice specific to education is the missing piece. In the short-term, having some kind of navigable course repository would allow members to reuse assets, which would be a great start 🎆

In the long-term, an educational collaborative could aim to:

  • Problem solve best practices for teaching certain topics/domains/tools
  • Seek experience from people who have already thought about this
  • Help other educators build an argument for funding to develop and teach new courses at their institution
  • Grow confidence for new educators
  • Map knowledge and identify gaps (this seems in line with @laderast's CD2H project, although probably a higher order goal once a community of practice is in full swing)

I have not heard about The Journal of Open Source Education, thank you for the link!

Also tagging my R education partners in crime: @ismayc @andrewpbray @rudeboybert @DJAnderson07, plus @kierisi and @jthomasmock to join in convo

@cboettig thanks for the ping. I'm more than happy to join a collective like this.

On Friday, May 18th, 2018, we'll start to open source some of the drawings used in STAT 385 @ UIUC this term in:

https://github.com/coatless/draw-r.

Note: The drawings have largely been done in Omnigraffle or Keynote ( cc'ing my inspiration / person I blame for my fascination in diagramming @hrbrmstr ). We're looking into the ability to generate drawings of R objects dynamically via base or ggplot2 graphics.

Outside of that, we'll be working this summer on releasing an R textbook covering "Statistical Programming Methods" or "Data Science Programming Methods" depending on the zeitgeist spirit:

https://github.com/coatless/spm

In the interim, consider some of the other education tech that we've built:

  • assignr: Tools for Educators Writing Assignments in RMarkdown (joint w/ @daviddalpiaz)
  • errorist: Automatic Error and Warning Search
  • dropcli: Dropbox CLI for working on a Linux environment within R.
  • rcpp-api: Unofficial Rcpp API Documentation (Lots of Examples!)
  • coursetools: Administrative Tools to Manage Online Courses (GitHub + Blackboard)
  • autograder: Automatically Grade Models (screenshot)

The latter two are somewhat restricted to UIUC personnel at the moment.

I love this idea! (and thanks for including me in the convo!) Just a few thoughts:

Part of what I think can be difficult, both as a learner and as an instructor, is just the sometimes overwhelming amount of stuff out there. There are lots of good resources, but navigating what to teach (or learn) and when can be difficult. I would imagine a collaborative like this sort of serving as a curator of open-source teaching materials, and the wisdom of the community could help inform the topic sequence.

In terms of the content itself, I wonder if it might make sense to have people/organizations actually submit their materials to a repo for inclusion, rather than actively gathering existing materials (although gathering known materials of high quality would be a great starting point). Then, they could potentially even go through some sort of peer review process, which would (potentially) not only help the community (through the addition of the new content) but also the instructor (by getting feedback on their materials).

I also tend to think a lot about the match between the learner and the content. This may be a few steps down the line, but I wonder if, after the content was curated, it might be worth thinking about developing some sort of survey or even pre-tests that would ultimately recommend where to start, e.g., what level of programming background do you/your students have? How much experience do you/your students have with statistics? This could potentially help align learner needs with content that is not too easy/difficult.

hey y'all, I'm a former high school science teacher with a lot of opinions about how we can integrate best practices from K-12 education as a means of equipping learners with the skills and confidence needed to teach themselves, with the ultimate goal being the (further) democratization of data science education (with R).

always happy to chat - this is an area I'm currently researching and actively beginning to develop resources in, many which will begin to find their way back into the R4DS online learning community (which I'm also always happy to talk about!)

Honored and many thanks for including me in this conversation!

I think as @DJAnderson07 mentioned, it is hard as a learner to sift through the blogs, Quora posts, SO answers, various package vignettes and figure out an efficient learning path. A curated, organized list of proper coursework would be huge for self-directed learners, many of whom may not have access to Data Science at the college level or even MOOCs such as Coursera, DataCamp, Udacity, etc.

All that being said, I lack the teaching expertise of likely everyone in this group, but am happy to contribute by reviewing or gathering material, and generally acting as a student advocate. Excited to see what comes of this!

A few of us started this effort a few years back, including Tracy Teal and Matt Jones!! We started a prototype site but nothing ever came of it. Happy to be involved in the discussion and to share what we learned from it as well! I'll post our beta website here when I'm back to my computer!

Found some great content from UC Berkeley's Data Science Program, unfortunately appears to be exclusively Python, but some great frameworks.

https://data.berkeley.edu/undergraduate-ds-pedagogy
http://data8.org/
http://data8.org/sp18/

@apreshill I think your idea would have a great positive impact; it would grow the R community in size and quality.

At the Smithsonian I see many excellent potential teachers and many interested potential learners. But the job of the potential teachers is not to teach, so they can spend very limited time on that.
You can't cut delivery time but you can cut preparation time. The pay off is huge and comes not from doing anything more than what we are already doing but from doing it differently -- just by keeping things organized.

Summary: The goal is to improve discoverability of existing #rstats education materials by identifying and aggregating those available in GitHub repositories. We could leverage metadata extracted from repos to further improve discoverability (like last updated, whether the repo contains a blogdown site or bookdown book, etc). There also seems to be interest in an educator's collaborative, which could be a hub for experienced and new educators to share not just resources but also ideas, advice, wisdom, etc.
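To make the metadata idea a bit more concrete, here is a small Python sketch of tagging a repo by site type and sorting by last update. It assumes a repo's top-level file names and last-updated date have already been fetched somehow. The repo records below are invented for illustration, and the marker-file mapping is a heuristic: `_bookdown.yml` and `_site.yml` are the usual bookdown and R Markdown website config files, while `config.toml` only suggests a Hugo site that may or may not be built with blogdown.

```python
from datetime import date

# Heuristic mapping from marker files to site-type tags (an assumption, not a rule).
MARKER_FILES = {
    "_bookdown.yml": "bookdown",
    "_site.yml": "R Markdown site",
    "config.toml": "blogdown (possibly)",
}


def site_tags(file_names):
    """Return sorted site-type tags implied by a repo's top-level file names."""
    return sorted({MARKER_FILES[f] for f in file_names if f in MARKER_FILES})


def sort_by_last_updated(repos):
    """Return repos ordered from most to least recently updated."""
    return sorted(
        repos,
        key=lambda repo: date.fromisoformat(repo["updated"]),
        reverse=True,
    )


# Hypothetical repo records, for illustration only.
repos = [
    {"name": "course-a", "files": ["_bookdown.yml", "index.Rmd"], "updated": "2018-03-01"},
    {"name": "course-b", "files": ["config.toml", "content"], "updated": "2018-05-10"},
    {"name": "course-c", "files": ["README.md"], "updated": "2017-11-20"},
]

for repo in sort_by_last_updated(repos):
    print(repo["name"], repo["updated"], site_tags(repo["files"]))
```

A repo with none of the marker files simply gets an empty tag list, which leaves room for topic, level, or license tags to be layered on later.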

Posting here in addition to Twitter. Might be of interest to the group. R Package in early development for aiding educators in building RStats material.

https://www.r-consortium.org/projects/awarded-projects

Thanks @jthomasmock for sharing and congrats @fmichonneau, that sounds awesome!

Want to link here to our repo from #runconf18:

https://github.com/ropenscilabs/rOpenSciEd