Output Format and Evaluation for Family History Extraction

Subtask 1: Entity Identification

For the first subtask, we would like you to extract two types of entities:

family members mentioned in the text
observations (diseases) in the family history

Participants should also provide the side of family (i.e., Paternal, Maternal, or NA) for each family member.

The possible family members in this task are:

Father, Mother, Parent, Sister, Brother, Daughter, Son, Child,
Grandmother, Grandfather, Grandparent, Cousin, Sibling, Aunt, Uncle.

Please do not include other relatives in your results (e.g. nephew).

Note

For first degree relatives (e.g. parents, children, siblings), the side of family should always be "NA".
To reduce ambiguity in observation extraction, we accept partial matching of observations. For example, extracting "diabetes" from the phrase "type 2 diabetes" will be counted as a true positive when calculating the F1 score. However, each observation should contain no more than 4 tokens.
To reduce the complexity of the task, the negation information is removed from evaluation for both subtasks.
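
The partial-matching rule and the 4-token limit can be sketched as follows. This is only an illustration: the helper names are hypothetical, and the matching rule (at least one shared lowercase whitespace token) is an assumption, so the official eval.py may apply a different criterion.

```python
def tokens(text):
    # Assumption: tokens are lowercase, whitespace-separated words.
    return text.lower().split()


def within_token_limit(observation, max_tokens=4):
    # Each observation should contain no more than 4 tokens.
    return len(tokens(observation)) <= max_tokens


def partial_match(predicted, gold):
    # Assumed rule: a prediction matches a gold observation
    # when the two share at least one token.
    return bool(set(tokens(predicted)) & set(tokens(gold)))


print(partial_match("diabetes", "type 2 diabetes"))
print(within_token_limit("type 2 diabetes"))
```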

Output Format

Within a single file, the fields are delimited by tabs.

doc_id  FamilyMember family_member   SideOfFamily
doc_id  Observation  text of observation

Examples:

doc_1	FamilyMember	Brother	NA
doc_1	FamilyMember	Grandfather	Paternal
doc_1	FamilyMember	Father	NA
doc_1	Observation	conduction disorders
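
As a rough sketch, rows in this format could be produced as shown below. The function and constant names are hypothetical and not part of the task materials. The sketch assumes that an unsupported relative should raise an error and that the side of family for first-degree relatives is silently forced to "NA".

```python
FAMILY_MEMBERS = {
    "Father", "Mother", "Parent", "Sister", "Brother", "Daughter", "Son",
    "Child", "Grandmother", "Grandfather", "Grandparent", "Cousin",
    "Sibling", "Aunt", "Uncle",
}
# First-degree relatives: parents, children, siblings.
FIRST_DEGREE = {
    "Father", "Mother", "Parent", "Sister", "Brother",
    "Daughter", "Son", "Child", "Sibling",
}
SIDES = {"Paternal", "Maternal", "NA"}


def family_member_row(doc_id, member, side="NA"):
    if member not in FAMILY_MEMBERS:
        raise ValueError(f"unsupported family member: {member}")
    if member in FIRST_DEGREE:
        side = "NA"  # side of family is always NA for first-degree relatives
    if side not in SIDES:
        raise ValueError(f"unsupported side of family: {side}")
    return "\t".join([doc_id, "FamilyMember", member, side])


def observation_row(doc_id, text):
    return "\t".join([doc_id, "Observation", text])


print(family_member_row("doc_1", "Grandfather", "Paternal"))
print(observation_row("doc_1", "conduction disorders"))
```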

Subtask 2: Family History Extraction

In Subtask 2, participants need to extract the relations between family members, observations, and living status. The purpose of this subtask is to evaluate participating systems as end-to-end family history summarization systems based on clinical texts.

The output file should be in TSV format, with the following columns:

Family member
Side of family
Living status
Observation

When there is more than one observation for a family member category, systems should provide those observations in separate rows.

[Notice!!] Unlike the 2018 task, this year's challenge requires participants to detect negation for observations. Specifically, "Negated" or "Non_Negated" should be given after each observation. Please refer to the example below for a more detailed explanation.

Output Format

Within a single file, the fields are delimited by tabs.

doc_id  family_member    side_of_family    LivingStatus    living_status_score
doc_id  family_member    side_of_family    Observation    text_of_observation1 Negated
doc_id  family_member    side_of_family    Observation    text_of_observation2 Non_Negated

Examples:

doc_1	Brother	NA	LivingStatus	4
doc_1	Father	NA	LivingStatus	4
doc_1	Grandfather	Paternal	LivingStatus	0
doc_1	Cousin	Paternal	Observation	phaeochromocytoma Non_Negated
doc_1	Cousin	Paternal	Observation	choreic dysphonia Non_Negated
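
A minimal sketch of producing these rows, with one row per observation, might look like this. The function names are hypothetical. The sketch also assumes that the negation label is a separate tab-delimited field following the observation text, which should be checked against the gold standard file.

```python
def living_status_row(doc_id, member, side, score):
    return "\t".join([doc_id, member, side, "LivingStatus", str(score)])


def observation_rows(doc_id, member, side, observations):
    # observations: iterable of (text, is_negated) pairs.
    # Each observation goes on its own row.
    rows = []
    for text, is_negated in observations:
        label = "Negated" if is_negated else "Non_Negated"
        # Assumption: the negation label is its own tab-delimited field.
        rows.append("\t".join([doc_id, member, side, "Observation", text, label]))
    return rows


observations = [("phaeochromocytoma", False), ("choreic dysphonia", False)]
print(living_status_row("doc_1", "Brother", "NA", 4))
for row in observation_rows("doc_1", "Cousin", "Paternal", observations):
    print(row)
```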

Living Status Score

We use only one score to represent living status for each individual.

For both "Alive" and "Healthy" properties, the results are encoded as:

Yes: 2
NA: 1
No: 0

The overall score for each individual is the alive score multiplied by the healthy score.

For example:

For a relative with "Alive: Yes" and "Healthy: Yes", the living status score should be 2 * 2 = 4.

For a relative with "Alive: No" and "Healthy: NA", the living status score should be 0 * 1 = 0.
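
The encoding and the product above can be written as a short function. This is a sketch with hypothetical names (ENCODING, living_status_score) that assumes the inputs are exactly the strings "Yes", "NA", or "No".

```python
# Encoding shared by the "Alive" and "Healthy" properties.
ENCODING = {"Yes": 2, "NA": 1, "No": 0}


def living_status_score(alive, healthy):
    # Overall score = alive score * healthy score.
    return ENCODING[alive] * ENCODING[healthy]


print(living_status_score("Yes", "Yes"))  # 2 * 2 = 4
print(living_status_score("No", "NA"))    # 0 * 1 = 0
```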

Note

If multiple relatives fall under the same family member category (e.g. multiple maternal aunts) with different living status scores, use the minimum of the scores as the final score for that category.
To be considered as a correct prediction (True Positive) for family members, all of the fields have to be matched, including living status.
The observation matching criterion is the same as subtask 1, where partial matching is allowed.
For Subtask 2, conditions that apply to all relatives should not be included. For example, given the sentence "There were no reports of mental retardation.", the observation "mental retardation" should not be attached to any family member.
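
The minimum rule for a family member category could be applied as in the sketch below. The function name and the input layout, a list of (family member, side of family, score) tuples, are assumptions made for illustration.

```python
from collections import defaultdict


def category_scores(relatives):
    # relatives: iterable of (family_member, side_of_family, score) tuples,
    # one per individual relative mentioned in the document.
    grouped = defaultdict(list)
    for member, side, score in relatives:
        grouped[(member, side)].append(score)
    # The final score of a category is the minimum over its relatives.
    return {key: min(scores) for key, scores in grouped.items()}


relatives = [
    ("Aunt", "Maternal", 4),
    ("Aunt", "Maternal", 0),
    ("Uncle", "Paternal", 2),
]
print(category_scores(relatives))
```

Here the two maternal aunts collapse into a single category with score 0.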

Evaluation

We use the standard F1 score as the evaluation (ranking) metric.

Specifically,

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 * Precision * Recall / (Precision + Recall)
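
These formulas translate directly into code. The sketch below is not the official eval.py. Returning 0.0 when a denominator is zero is an assumption, since the formulas above do not define that case.

```python
def precision_recall_f1(tp, fp, fn):
    # Assumption: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = precision_recall_f1(tp=1, fp=1, fn=1)
print(precision, recall, f1)  # 0.5 0.5 0.5
```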

Please run the evaluation script between your output file and the gold standard TSV file (sent separately) to check your system performance.

python eval.py subtask path/to/gold_standard path/to/your/prediction

The subtask field is either 1 or 2. For instance, for subtask 1:

python eval.py 1 gs_subtask1.tsv path/to/your/subtask1/prediction

You will find your system performance in your command line console.

TP: 32
FP: 1
FN: 3
Precision: 0.9696969696969697
Recall: 0.9142857142857143
F1: 0.9411764705882354

Contact

Please find more information on our challenge page: https://n2c2.dbmi.hms.harvard.edu/track2

Acknowledgment

Please kindly cite our task overview paper as:

Sijia Liu, Majid Rastegar-Mojarad, Yanshan Wang, Liwei Wang, Feichen Shen, Sunyang Fu, Hongfang Liu. Overview of the BioCreative/OHNLP 2018 Family History Extraction Task. BioCreative/OHNLP 2018 Workshop Proceedings.
