Core Competencies in Computational Data Science

This document attempts to delineate the Core Competencies of a computational data-science curriculum. Our target audience is instructors and educators who teach computational data-science practitioners: scientists, students, engineers, or other professionals who use computers to create, process, or analyze data. Much has been written on this topic before, usually in the form of "best practices" and exhortations that leave practitioners responsible for educating themselves. We instead address educators and aim to provide a guide for the design of computational data-science curricula.

Our approach is problem-based: Computational data-science practitioners regularly encounter challenges [1] in managing large and diverse datasets, installing and maintaining research software, sharing results, ensuring reproducibility, and developing high-quality research software. We humbly advance these Core Competencies as potential skills and practices that will help practitioners overcome these challenges. Core Competencies are articulated as Learning Outcomes with diverse representations. Because there is often more than one solution to a single problem, we hope to present multiple strategies for addressing each one, so that educators have several options for connecting with learners.

[1] Noble, W. S. 2009. A quick guide to organizing computational biology projects. PLoS Computational Biology 5 (7):e1000424.

Contributions

Contributions from educators, students, or computational data-science professionals are welcome and encouraged. You can contribute by filing an Issue or submitting a Pull Request.

Acknowledgments

Work on this document by K. Arthur Endsley was supported by a grant from NASA's Transform to Open Science (TOPS) Training program (80NSSC23K0864).