The draft of the plan is available at https://grants.nih.gov/grants/rfi/NIH-Strategic-Plan-for-Data-Science.pdf
The RFI is available at https://grants.nih.gov/grants/rfi/rfi.cfm?ID=73
The appropriateness of the goals of the plan and of the strategies and implementation tactics proposed to achieve them
A major strategic plan should begin with well-defined ultimate objectives, success criteria, and keen insight supported by research and analysis into the most pressing root causes of the inefficiencies in the current practices of data management across relevant disciplines.
The proposed plan does not appear to treat data management as a serious subject worthy of a critical approach. In the introduction, the authors even dismiss data management as a nuisance that takes scientists' attention away from more worthwhile and creative pursuits. This low value placed on data organization is itself one of the root causes that the plan misses entirely.
As the biosciences march toward increasingly data-centric collaborative projects, data become principal products of research and the role of data science in the lab grows. In small projects, individual researchers have little incentive to invest in good data management processes until publication. Collaborative science will require a cultural shift that places greater value on data integrity and data science education. As data complexity increases, the data pipeline will need to be planned and designed alongside the experiment itself, with no less attention and rigor than the experimental design. Scientists will need to be well versed and well equipped in questions of data organization, security, accessibility, and integrity. Teams of scientists will need to craft and update their data pipelines to reflect the evolving structure of their experiments and analyses.
Implementing such collaborations requires prioritizing data organization as a core research activity, and changing the social contract and the incentive structure for scientists who collect, organize, and process data for timely analysis by the broader community.
Instead, the current plan pursues only two rather tangential issues: improving repository infrastructure and enforcing data submission standards for publication. Unfortunately, these solutions alone will do little to encourage the broad data-centric collaborations that accelerate discovery. The NIH strategic plan could become a force multiplier by defining measures that alter the incentive structure and by providing high-quality instruction that encourages scientists to apply good data organization principles in every aspect of their work, as part of their creative process.