petermr/CEVOpen

📕 Documentation: Dictionary.xml and DictionaryDescription.md of: eoPlantPart


This issue tracks the creation of a [DictionaryName]DictionaryDescription.md document describing the contents of the individual dictionary named in the title of this issue, which was created (or is in the process of being created) from data collected for Oil186.

I will begin this thread by pasting the contents of the INDEX description, followed by a first draft below for discussion and direction.

Plant Parts

The plant part or parts from which the mentioned oils are extracted

 

PlantPartsDictionaryDescription.md

Plant Parts Dictionary

 

A dictionary of [XX] part(s) of a plant from which Essential Oils — mentioned in the 186 test articles downloaded from PubMed — were extracted.

 

File Data

 

Table Column Headings

  • title: type of data to be normalized and tagged with a Wikidata ID. In this case, “plantParts”

  • description: Short description of the plant part being identified in that row

  • id:

  • name: a human readable string describing the concept.

  • term: the precise string used to identify the concept. (Name and Term are often the same.)

  • wikidata: Unique identifier for each normalized dictionary term, linked to Wikidata.org — a free and open knowledge base that can be read and edited by both humans and machines.

  • wikipedia:

  • query:

 

Contents/Results

  • No. of source papers: ??

  • No. of Entries (Headers are not counted): 18

  • No. of unique entries (including alternate spellings or synonyms): 18

  • No. of Chemical Compounds resolved in Wikidata: ????

  • No. of Chemical Compounds NOT resolved in Wikidata: ???

 

Notes:

More work needs to be done on this dictionary.

Errors?

  • This is the first case where the column heading “description” means something other than "data source / method of input"

  • In this case, is the column heading “id” related to Essoil? I don’t know how to describe it here. The format is: CM.plantParts.n where n is a serialized number

  • I don’t know how to describe the column headings for “Wikipedia” or “query” in this case
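For the id format mentioned above (CM.plantParts.n), here is a minimal sketch of how the ids could be checked and generated. It assumes n is a plain positive integer with no padding; the names `ID_PATTERN`, `serial_number` and `make_ids` are my own and are not part of any existing CEVOpen tooling.

```python
import re

# Assumption: an id is the literal prefix "CM.plantParts." followed by digits.
ID_PATTERN = re.compile(r"CM\.plantParts\.(\d+)")


def serial_number(entry_id):
    """Return n from an id like 'CM.plantParts.n', or None if it does not match."""
    match = ID_PATTERN.fullmatch(entry_id)
    return int(match.group(1)) if match else None


def make_ids(count, start=1):
    """Generate `count` consecutive ids starting at serial number `start`."""
    return [f"CM.plantParts.{n}" for n in range(start, start + count)]
```

A check like this would flag ids with a different capitalisation (for example `CM.plantparts.3`), which may or may not be desirable.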

Currently, the plantparts.xml data is sparse.
I found this list (https://www.collinsdictionary.com/word-lists/plant-parts-of-plants) that provides many more entries, and I will incorporate them into the dictionary, along with Wikidata IDs where available. As a placeholder, I've created plantParts20200222.xlsx in the CEVOpen/dictionary/plantparts/ directory and pasted in the list above to work with.

@petermr I will likely need Gita to verify or supply a better source for this list of terms.

Plant Parts Dictionary is now complete and online

Next I'll update the results data for its dictionary description .md file

As of today, I believe this dictionary and its description document are complete. Below I will copy the contents of the description document:

EO Plant Part Dictionary

File Data

 

Table Column Headings

  • id: serialized identifier

  • name: a human readable string describing the concept.

  • term: the precise string used to identify the concept. (Name and Term are often the same.)

  • wikidata: Unique identifier for each normalized dictionary term, linked to Wikidata.org — a free and open knowledge base that can be read and edited by both humans and machines.

  • description: Short description of the plant part being identified in that row
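To make the column headings above concrete, here is a small sketch that reads one entry carrying those five fields. It assumes the dictionary is stored as `<entry>` elements with one attribute per column inside a `<dictionary>` root; that layout, the sample values and the `Q_PLACEHOLDER` identifier are illustrative assumptions, not copied from the real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real dictionary file may be laid out differently.
SAMPLE = """<dictionary title="eoPlantPart">
  <entry id="CM.plantParts.1" name="leaf" term="leaf"
         wikidata="Q_PLACEHOLDER" description="placeholder description"/>
</dictionary>"""

COLUMNS = ("id", "name", "term", "wikidata", "description")


def read_entries(xml_text):
    """Return one dict per <entry>, with an empty string for any missing column."""
    root = ET.fromstring(xml_text)
    return [{col: entry.get(col, "") for col in COLUMNS}
            for entry in root.iter("entry")]


rows = read_entries(SAMPLE)
```

Each row is then a plain dict keyed by column heading, which is convenient for spot-checking entries against the description document.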

 

Contents/Results

  • No. of Entries (Headers are not counted): 285

  • No. of unique entries (including alternate spellings or synonyms): 285

  • No. of entries resolved in Wikidata: 231
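With the figures above, 285 - 231 = 54 entries would still lack a Wikidata ID. Below is a rough sketch of how these three counts could be recomputed from the entries. It assumes each entry is a dict with `term` and `wikidata` keys, that uniqueness is judged on the case-insensitive term, and that "resolved" means the wikidata field holds an identifier of the form Q followed by digits. The function name `summarize` is hypothetical.

```python
import re

QID = re.compile(r"Q\d+")  # Wikidata item identifiers look like Q followed by digits


def summarize(rows):
    """Count entries, unique terms, and entries carrying a Wikidata item ID."""
    return {
        "entries": len(rows),
        "unique": len({row["term"].strip().lower() for row in rows}),
        "resolved": sum(
            1 for row in rows if QID.fullmatch(row.get("wikidata", "").strip())
        ),
    }


# Dummy rows for illustration only; the Q-numbers are not real lookups.
example = [
    {"term": "leaf", "wikidata": "Q111"},
    {"term": "Leaf", "wikidata": ""},
    {"term": "root", "wikidata": "Q222"},
]
counts = summarize(example)
```

If the unique count were computed this way and came out lower than the entry count, that would point to duplicate terms worth reviewing.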

 

Notes: