📕 Documentation: Dictionary.xml and DictionaryDescription.md of: eoActivity
This issue covers creating a [DictionaryName]DictionaryDescription.md document that describes the contents of the individual dictionary named in the title of this issue, which was created (or is in the process of being created) from data collected for Oil186.
I will begin this thread by pasting the contents of the INDEX description, followed by a first draft below for discussion and direction.
EO Activities
ActivityDictionaryDescription.md
- Description: A dictionary of the names of 438 essential oil or constituent compound biochemical and/or biological activities, 340 of which resolved to wikidata IDs, and 336 with short descriptions.
- Filename: activity.xml
- File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/activity/activity.xml
Activity Dictionary
A dictionary of 184 activities mentioned in the 186 test articles downloaded from PubMed.
File Data
- Filename: activity.xml
- File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/activity/activity.xml
Table Column Headings
- title: type of data to be normalized and tagged with Wikidata ID.
- desc: data source
- id: CM.activities.n where n is a serialized number
- name: The name is a human-readable string describing the concept.
- term: The term is the precise string used to identify the concept. Name and Term are often the same.
- wikidata: Unique identifier linked to Wikidata.org, a free and open knowledge base that can be read and edited by both humans and machines.
- wikipedia:
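To make the id format concrete, here is a minimal Python sketch that generates and checks identifiers of the form CM.activities.n. The helper names `make_id` and `is_valid_id` are hypothetical, and the assumption that n is a positive integer without zero-padding is mine; the dictionary itself does not state it.

```python
import re

# Assumed pattern: the literal prefix "CM.activities." followed by a
# positive integer with no leading zeros (an assumption, not a spec).
ID_PATTERN = re.compile(r"^CM\.activities\.([1-9][0-9]*)$")


def make_id(n: int) -> str:
    """Build the id for the n-th entry (hypothetical helper)."""
    if n < 1:
        raise ValueError("serial number must be a positive integer")
    return f"CM.activities.{n}"


def is_valid_id(value: str) -> bool:
    """Return True if value matches the assumed CM.activities.n format."""
    return ID_PATTERN.match(value) is not None


# Ids for the first three entries of a dictionary.
ids = [make_id(n) for n in range(1, 4)]
```

A check like `is_valid_id` could be run over every entry before upload to catch ids that were typed by hand.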
Contents/Results
- No. of source papers: 186
- No. of Entries (Headers are not counted): 184
- No. of unique compound names (including alternate spellings or synonyms): 184
- No. of Chemical Compounds resolved in Wikidata: 74
- No. of Chemical Compounds NOT resolved in Wikidata: 110
Notes:
- No source papers are listed. Should we assume 186, or delete that from Contents/Results?
- We need to normalize the headings across all Dictionaries
- This is the third case where the column heading “description” means something other than "data source / method of input"
- Capitalization
- In this case, is the column heading “id” related to Essoil? I don’t know how to describe it here. The format is: CM.activities.n where n is a serialized number
- I don’t know how to describe the column headings for “Wikipedia” here
@petermr Currently working on cleaning the activities.xml dictionary.
Searching Wikidata for "antiacne" I found this entry:
https://www.wikidata.org/wiki/Q143139 "therapeutic subgroup of the Anatomical Therapeutic Chemical Classification System: Anti-acne preparations"
which led me to search and find this:
https://www.wikidata.org/wiki/Q192093 "classification of active ingredients of drugs according to the organ or system on which they act and their therapeutic, pharmacological and chemical properties."
and this: https://en.wikipedia.org/wiki/Anatomical_Therapeutic_Chemical_Classification_System
Questions:
- In the absence of a wikidata ID for "antiacne", should I...
  a) use no id at all
  b) use https://www.wikidata.org/wiki/Q143139
  c) use the ID for "acne" and let users put 2 and 2 together about the "anti-" part?
- Should we be adding the Anatomical_Therapeutic_Chemical_Classification_System’s IDs to the activities dictionary as well as wikidata?
https://www.whocc.no/atc_ddd_index/
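For looking up candidate IDs for a term such as "antiacne", a minimal Python sketch against the Wikidata `wbsearchentities` API could look like the following. The function names, the User-Agent string, and the `sample` payload are my own illustration: the sample is hand-written from the Q143139 entry quoted above, not a recorded API response, and the network call is defined but not run here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "https://www.wikidata.org/w/api.php"


def build_search_url(term: str, language: str = "en", limit: int = 5) -> str:
    """Build a wbsearchentities query URL for one dictionary term."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


def parse_candidates(payload: dict) -> list:
    """Extract (id, description) pairs from a wbsearchentities response."""
    return [
        (hit["id"], hit.get("description", ""))
        for hit in payload.get("search", [])
    ]


def search_wikidata(term: str) -> list:
    """Query Wikidata over the network (not executed in this sketch)."""
    # Placeholder User-Agent; replace with a descriptive one of your own.
    request = Request(build_search_url(term),
                      headers={"User-Agent": "dictionary-cleanup-sketch/0.1"})
    with urlopen(request) as response:
        return parse_candidates(json.load(response))


# Hand-written sample shaped like an API response, for illustration only.
sample = {
    "search": [
        {
            "id": "Q143139",
            "description": "therapeutic subgroup of the Anatomical "
                           "Therapeutic Chemical Classification System: "
                           "Anti-acne preparations",
        }
    ]
}
candidates = parse_candidates(sample)
```

The sketch only lists candidates; choosing between options a), b), and c) remains a manual decision.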
Incidentally, the WHO Collaborating Centre for Drug Statistics Methodology also has standard abbreviations for the following, which may be useful as dictionaries as well.
Units
| Abbreviation | Meaning |
|---|---|
| g | gram |
| mg | milligram |
| mcg | microgram |
| U | unit |
| TU | thousand units |
| MU | million units |
| mmol | millimole |
| ml | milliliter (e.g. eyedrops) |
Route of administration (Adm.R)
| Abbreviation | Meaning |
|---|---|
| Implant | Implant |
| Inhal | Inhalation |
| Instill | Instillation |
| N | nasal |
| O | oral |
| P | parenteral |
| R | rectal |
| SL | sublingual/buccal/oromucosal |
| TD | transdermal |
| V | vaginal |
OK, I will add new entries as I go. If that is too time-consuming, I’ll swing back and do it after the dictionaries are cleaned, and then update them accordingly.
I have just finished uploading the cleaned, disambiguated, and Wikidata-attributed activities dictionary, and updated its description as well as the master INDEX of descriptions.
ActivityDictionaryDescription.md
- Description: A dictionary of the names of 438 essential oil or constituent compound biochemical and/or biological activities, 340 of which resolved to wikidata IDs, and 336 with short descriptions.
- Filename: activity.xml
- File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/activity/activity.xml
Hallelujah.
activity.xml and ActivityDictionaryDescription.md are now updated and working.
I have also updated master INDEXofOIL186Dictionaries.md
As of today, I believe this dictionary and its description document are complete. Below I will copy the contents of the description document:
EO Activity Dictionary
File Data
- Description: A dictionary of 438 essential oil or constituent compound biochemical and/or biological activities, 340 of which resolved to wikidata IDs, and 336 with descriptions of 250 characters or less.
- Filename: eoActivity.xml
- File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/eoActivity/eoActivity.xml
Table Column Headings
- id: serialized identification number
- term: The name is a human-readable string describing the concept.
- wikidataID: Unique identifier linked to Wikidata.org, a free and open knowledge base that can be read and edited by both humans and machines.
- description: short description of the activity sourced from wikidata and/or wikipedia
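As an illustration of how the four columns might map onto the XML file, here is a minimal Python sketch that builds entries with the standard library. It assumes each row is an `<entry>` element inside a `<dictionary>` root, with the column headings as attribute names; that layout, the helper `make_entry`, and all the values shown are placeholders of mine, not taken from eoActivity.xml. The 250-character limit on descriptions comes from the File Data section.

```python
import xml.etree.ElementTree as ET

MAX_DESCRIPTION_LENGTH = 250  # limit stated in the File Data description


def make_entry(entry_id, term, wikidata_id=None, description=None):
    """Build one <entry> element (hypothetical layout, see lead-in)."""
    attrs = {"id": entry_id, "term": term}
    if wikidata_id:
        attrs["wikidataID"] = wikidata_id
    if description:
        if len(description) > MAX_DESCRIPTION_LENGTH:
            raise ValueError("description exceeds 250 characters")
        attrs["description"] = description
    return ET.Element("entry", attrs)


# Placeholder values only; none of these come from the real dictionary.
dictionary = ET.Element("dictionary", {"title": "eoActivity"})
dictionary.append(
    make_entry("1", "example activity", "Q_PLACEHOLDER", "placeholder text")
)
# An entry that did not resolve in Wikidata carries only id and term.
dictionary.append(make_entry("2", "unresolved activity"))

xml_text = ET.tostring(dictionary, encoding="unicode")
```

Leaving the attribute out entirely for unresolved entries, rather than writing an empty string, is one design choice among several; the real file may do it differently.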
Contents/Results
- No. of source papers: 186
- No. of entries (Headers are not counted): 438
- No. of unique activity names (including alternate spellings or synonyms): 438
- No. of activities resolved in wikidata (including alternate spellings or synonyms): 340
- No. of unique wikidata IDs attributed to activities (normalizing for alternate spellings and synonyms): 250
- No. of entries without wikidata ID: 98
- No. of entries with descriptions: 336
- No. of entries without descriptions: 102
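Counts like these can be recomputed from the XML rather than tallied by hand. The following is a minimal Python sketch under the assumption that the file is a `<dictionary>` root with one `<entry>` child per row and that `term`, `wikidataID`, and `description` are attributes; the `SAMPLE` document and its values are placeholders of mine, not real dictionary content.

```python
import xml.etree.ElementTree as ET

# Placeholder document in the assumed layout; "Q-A" is not a real ID.
SAMPLE = """<dictionary title="sample">
  <entry id="1" term="activity a" wikidataID="Q-A" description="placeholder"/>
  <entry id="2" term="activity b" wikidataID="Q-A"/>
  <entry id="3" term="activity c"/>
</dictionary>"""


def summarize(root):
    """Compute the Contents/Results counts for a parsed dictionary."""
    entries = root.findall("entry")
    with_id = [e for e in entries if e.get("wikidataID")]
    with_desc = [e for e in entries if e.get("description")]
    return {
        "entries": len(entries),
        "unique_terms": len({e.get("term") for e in entries}),
        "resolved": len(with_id),
        # Synonyms share an ID, so this can be lower than "resolved".
        "unique_wikidata_ids": len({e.get("wikidataID") for e in with_id}),
        "unresolved": len(entries) - len(with_id),
        "with_description": len(with_desc),
        "without_description": len(entries) - len(with_desc),
    }


summary = summarize(ET.fromstring(SAMPLE))
```

For the real file, `ET.parse(path).getroot()` would replace `ET.fromstring(SAMPLE)`. The figures above are at least internally consistent: 340 + 98 = 438 and 336 + 102 = 438.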