carpentries-incubator/r-ml-tabular-data

ML contextual issues to consider


Consider adding exposition on the following topics, mostly dealing with the context of ML. (h/t Greg Janee)

  • When is ML appropriate? When is it not? How can you assess if ML is/will be a viable approach for a particular problem?
  • ML and ethics
  • What data analysis and preparation should happen before throwing data into the ML black box and seeing what comes out?
  • Some discussion of the structure of an xgboost or random forest model (e.g., a list of trees)
  • Reserve time at the end to ask students how they might employ what they've just learned in their own research area. Not all students will have an answer, of course. But this could foster a very interesting discussion, and help close the gap between the student walking out the door of the workshop and later being able to apply what they learned.
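
To make the "list of trees" bullet concrete, here is a minimal, language-agnostic sketch written in plain Python (the lesson itself uses R, so this is not the xgboost or randomForest object layout). The nested-dict node format, the helper names, and the feature names `age` and `income` are all hypothetical. It only illustrates the idea that a fitted ensemble is a collection of trees whose outputs are combined: averaged in a random forest, summed on top of a base score in boosting (for regression-style outputs).

```python
# Hypothetical hand-written ensemble: each tree is a nested dict.
# Internal nodes test one feature against a threshold; leaves hold a value.

def leaf(value):
    return {"value": value}

def split(feature, threshold, left, right):
    return {"feature": feature, "threshold": threshold,
            "left": left, "right": right}

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's value."""
    while "value" not in node:
        branch = "left" if row[node["feature"]] < node["threshold"] else "right"
        node = node[branch]
    return node["value"]

def predict_forest(trees, row):
    """Random-forest style: average the per-tree predictions."""
    return sum(predict_tree(t, row) for t in trees) / len(trees)

def predict_boosted(trees, row, base_score=0.0):
    """Boosting style: start from a base score and add every tree's output."""
    return base_score + sum(predict_tree(t, row) for t in trees)

# The "model" is literally a list of trees.
trees = [
    split("age", 30, leaf(1.0), leaf(3.0)),
    split("income", 50, leaf(2.0), split("age", 40, leaf(4.0), leaf(6.0))),
]
row = {"age": 35, "income": 60}

print(predict_forest(trees, row))
print(predict_boosted(trees, row, base_score=0.5))
```

In a workshop, the same point could be made by printing a single tree from a fitted model and tracing one observation through it by hand.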

Regarding the appropriateness of ML, the chart on page 25 of ISLR is helpful:
https://www.statlearning.com/
Upshot: More flexible ML models can make better predictions, at the expense of interpretability.
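
A small sketch of that tradeoff, under stated assumptions: the data are synthetic (a sine curve plus noise, chosen so that a straight line is a poor fit), the two models are a one-variable linear regression and a 3-nearest-neighbour average, and everything is plain Python rather than the R used in the lesson. It is meant as an illustration, not a benchmark. On data with a near-linear signal, the ranking could easily reverse.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic data (an assumption for illustration): sine curve plus small noise.
xs = [2 * math.pi * i / 59 for i in range(60)]
ys = [math.sin(x) + rng.gauss(0, 0.1) for x in xs]

# Alternate points between training and test sets.
train = list(zip(xs[0::2], ys[0::2]))
test = list(zip(xs[1::2], ys[1::2]))

def fit_linear(data):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def predict_knn(data, x, k=3):
    """Average the y values of the k training points closest to x."""
    nearest = sorted(data, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(pairs):
    """Mean squared error over (prediction, truth) pairs."""
    return sum((p - y) ** 2 for p, y in pairs) / len(pairs)

intercept, slope = fit_linear(train)
linear_mse = mse([(intercept + slope * x, y) for x, y in test])
knn_mse = mse([(predict_knn(train, x), y) for x, y in test])

# The linear model is summarized by two numbers; 3-NN has no such summary.
print(f"linear: y = {intercept:.2f} + {slope:.2f} * x, test MSE = {linear_mse:.3f}")
print(f"3-NN:   no coefficients to report, test MSE = {knn_mse:.3f}")
```

The linear model can be explained in one sentence (its slope and intercept), while the nearest-neighbour model predicts better here but offers nothing comparable to read off.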