ncss-tech/SoilKnowledgeBase

Converting Part 629 Ex. A into JSON artifacts

dylanbeaudette opened this issue · 3 comments

A collection of code used to generate potential JSON artifacts to support linking to the GDS. Originally created to support work in SoilWeb and as a demonstration to the soil ontology crowd. The output is currently generated from static (TXT) clips of Part 629 Ex. A (with minor editing for typos), but it should be based on the GDS data provided by SKB.

PR: #27

Example output:

"back-barrier flat": {
    "def": [
      {
        "text": "A subaerial, gently sloping landform on the lagoon side of the barrier beach ridge composed predominantly of sand washed over or through the beach ridge during tidal surges; a portion of a barrier flat.",
        "compare": "Compare – washover-fan flat.",
        "sources": ["SS", "SSS"]
      }
    ]
  }

This has been addressed. The "GDS glossary" terms are parsed alphabetically into JSON files stored in the inst/extdata/NSSH/629 folder.
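For anyone wanting to consume these files, here is a minimal sketch of building a term lookup in Python. The file names and the top-level layout of each file are not shown in this thread, so both are assumptions: the sketch globs for `*.json` and walks whatever nesting it finds, collecting objects that carry a `term` key like the entries shown later in this comment. `iter_entries` and `load_glossary` are hypothetical helper names, not part of SKB.

```python
import json
from pathlib import Path


def iter_entries(node):
    """Yield every dict that has a "term" key, at any nesting depth."""
    if isinstance(node, dict):
        if "term" in node:
            yield node
        else:
            for value in node.values():
                yield from iter_entries(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_entries(item)


def load_glossary(folder):
    """Build a {term: entry} lookup from all *.json files in `folder`.

    Assumption: the folder holds one or more JSON files; their names and
    exact nesting are not documented here, so nothing is hard-coded.
    """
    glossary = {}
    for path in sorted(Path(folder).glob("*.json")):
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
        for entry in iter_entries(data):
            glossary[entry["term"]] = entry
    return glossary
```

Usage would look something like `load_glossary("inst/extdata/NSSH/629")["ballena"]`, assuming the working directory is a checkout of the repository.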

I added several more data elements that were parseable from 629:

  • whether the term is considered colloquial ("colloquial")
  • where a colloquial term is used ("colloquloc")
  • whether the term is considered obsolete ("obsolete")
  • the preferred term, if any ("preferred")

For instance:

[
  {
    "term": "backshore terrace",
    "text": "(not preferred) Refer to berm.",
    "compare": null,
    "sources": null,
    "colloquial": false,
    "colloquloc": [],
    "obsolete": true,
    "preferred": "berm"
  }
],

or

[
  {
    "term": "ballena",
    "text": "(colloquial: western United States) A fan remnant having a distinctively-rounded surface of fan alluvium. The ballena’s broadly rounded shoulders meet from either side to form a narrow summit and merge smoothly with concave side slopes and then concave, short pediments that form smoothly rounded drainageways between adjacent ballenas. A partial ballena is a fan remnant large enough to retain some relict fan surface on a remnant summit.",
    "compare": null,
    "sources": ["FFP", "SW"],
    "colloquial": true,
    "colloquloc": "western United States",
    "obsolete": false,
    "preferred": []
  }
],
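To illustrate how fields like these could be derived from the definition text, here is a rough Python sketch. It is not the SKB parser; the regular expressions are inferred only from the two example entries above, and the mapping of "(not preferred)" to `obsolete` simply mirrors the "backshore terrace" output. `parse_flags` is a hypothetical name, and the sketch returns `None` where the JSON above shows `[]`.

```python
import re

# Hypothetical patterns, inferred only from the two example entries above.
COLLOQUIAL = re.compile(r"^\(colloquial:\s*([^)]+)\)\s*")
NOT_PREFERRED = re.compile(r"^\(not preferred\)\s*")
REFER_TO = re.compile(r"Refer to ([^.]+)\.")


def parse_flags(text):
    """Derive colloquial/obsolete/preferred fields from a definition string."""
    colloquial = COLLOQUIAL.match(text)
    not_preferred = NOT_PREFERRED.match(text)
    # Only look for a preferred term when the entry is marked "not preferred".
    refer = REFER_TO.search(text) if not_preferred else None
    return {
        "colloquial": colloquial is not None,
        "colloquloc": colloquial.group(1).strip() if colloquial else None,
        "obsolete": not_preferred is not None,
        "preferred": refer.group(1).strip() if refer else None,
    }
```

Real entries in 629 likely use more variants than these two patterns cover, so anything unmatched would need a closer look rather than being treated as a plain definition.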

Part 629 gets pretty particular about the meaning of "preferred", "obsolete", etc., so the full nuance of each entry may not be captured. I also think a couple of colloquial locations are misparsed and need addressing.
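One possible way to hunt for the misparsed locations is a small consistency check over the parsed entries. The Python sketch below is only a heuristic under stated assumptions: that a colloquial entry's text begins with "(colloquial", and that a correctly parsed location appears verbatim in that text. `as_list` and `suspicious_location` are hypothetical helpers; `as_list` also smooths over the fact that `colloquloc` appears as either `[]` or a string in the examples.

```python
def as_list(value):
    """Normalize null, [], a single string, or a list of strings to a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def suspicious_location(entry):
    """Return True when an entry's colloquial fields disagree with each other."""
    locations = as_list(entry.get("colloquloc"))
    is_colloquial = bool(entry.get("colloquial"))
    text = entry.get("text") or ""
    # The flag and the location should be present (or absent) together.
    if is_colloquial != bool(locations):
        return True
    # Assumption: colloquial entries start with a "(colloquial ...)" prefix.
    if is_colloquial != text.startswith("(colloquial"):
        return True
    # A location that does not occur in the definition text was probably misparsed.
    return any(loc not in text for loc in locations)
```

Entries flagged this way are candidates for manual review, not confirmed errors.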

This is great, thanks for the substantial improvements.

Good to close then, yes?