Developing a continuous benchmarking environment for NLP de-id methods.
The widespread adoption of Electronic Health Records (EHRs) has enabled secondary use of EHR data for clinical research and healthcare delivery. Because much of the detailed patient information is recorded in clinical narratives, unlocking that information and integrating it with structured EHR data has become critical for EHR-based studies. However, the Protected Health Information (PHI) contained in clinical narratives is a barrier to conducting EHR-based clinical research and to sharing research data across sites.
- Create a cloud-based environment that enables the systematic validation of text analytics tools to solve specific tasks (i.e. the “NLP Sandbox”).
- Populate the “NLP Sandbox” with appropriate reference data sets to be used in shared validation tasks.
- Engage CTSA hubs to contribute tools and methods to the project and demonstrate their performance, reproducibility, and rigor in such a shared environment.
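
To make "systematic validation" of a de-id tool concrete, here is a minimal sketch of how predicted PHI annotations could be scored against a reference data set. It is an illustration only, not the NLP Sandbox's actual evaluation code: the `Span` schema, the `score` function, the exact-match criterion, and the toy annotations are all assumptions made for this example.

```python
from typing import NamedTuple, Set


class Span(NamedTuple):
    """A PHI annotation: character offsets plus a type label (hypothetical schema)."""
    start: int
    end: int
    label: str


def score(gold: Set[Span], predicted: Set[Span]) -> dict:
    """Exact-match precision, recall, and F1 between gold and predicted spans."""
    # A prediction counts as correct only if offsets and label match exactly.
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy annotations, invented for illustration (not real patient data).
gold = {Span(8, 18, "NAME"), Span(27, 37, "DATE")}
predicted = {Span(8, 18, "NAME"), Span(40, 45, "ID")}
result = score(gold, predicted)
print(result)
```

Exact matching is the strictest choice; a shared benchmark might also report relaxed (overlap-based) or per-label scores, which this sketch does not cover.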
- Tools and Cloud Infrastructure
- Next Generation Data Sharing
- Informatics Maturity and Best Practices
Point person (github handle) | Site | Program Director |
---|---|---|
Justin Guinney (@jguinney) | Sage Bionetworks | Melissa Haendel (@mellybelly) |
Project scientific leadership (1-3 persons).
Leads (github handle) | Site |
---|---|
Thomas Schaffter (@tschaffter) | Sage Bionetworks |
James Eddy (@jaeddy) | Sage Bionetworks |
Members (github handle) | Site |
---|---|
Thomas Schaffter (@tschaffter) | Sage Bionetworks |
Yao Yan (@yy6linda) | Sage Bionetworks |
Yooree Chae (@ychae) | Sage Bionetworks |
James Eddy (@jaeddy) | Sage Bionetworks |
Justin Guinney (@jguinney) | Sage Bionetworks |
George Kowalski (@gkowalski) | MCW |
Bradley Taylor (@btaylormcw) | MCW |
Tom Dillon (@tmdillon) | WashU |
Resource | Link | Site |
---|---|---|
GitHub team | nlp-team | CD2H |
GitHub project | data2health/projects/7 | CD2H |
Google folder | NLP Sandbox | CD2H |
Slack channel | CD2H workspace / nlp-sandbox | CD2H |
Access to resources is limited to onboarded participants (CD2H Onboarding Form).
We encourage the community to get involved. Please make tickets or provide comments.