osd_to_json: rare incorrect labeling in output
brownag opened this issue · 2 comments
Sections in the output are occasionally labeled incorrectly when groups are out of order and/or missing. Out-of-order content is handled correctly a very high percentage of the time, so something specific is likely confusing the algorithm in these rare cases.
It may be that the only way to fix the edgiest of cases is with some sort of post-processing of content versus the parsed results -- but I suspect that when I look, it will turn out to be something wrong with the implementation.
An example is the ZADE series OSD JSON, where everything after Geographic Setting is out of order [offset by one]:
SoilKnowledgeBase/inst/extdata/OSD/Z/ZADE.json
Lines 26 to 54 in f2dfb54
If we pull up the OSD nothing pops out as being immediately wrong with it... until you see that GEOGRAPHICALLY ASSOCIATED SOILS is missing. They use a long list-form Competing section -- I suppose instead? -- pretty nice.
https://github.com/ncss-tech/OSDRegistry/blob/main/OSD/Z/ZADE.txt
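The code at one point concatenated a list of numeric index vectors -- some of which could be zero length. A zero-length vector contributes nothing when concatenated, so a missing section would shorten the result and shift every later index by one position, which matches the offset seen in ZADE.
With 2ea007a the vector is properly padded with NA, which makes the subsequent steps work right -- I think.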
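To illustrate the failure mode described above, here is a minimal sketch in Python. It is not the package's R code: the function names, the choice to keep only the first match per heading, and the line numbers are all hypothetical. It only shows why flattening per-heading match lists loses alignment when one list is empty, and how padding with a placeholder (`None` here, standing in for `NA`) keeps it.

```python
from typing import Dict, List, Optional

# Expected section headings, in order (a shortened, illustrative list).
headings: List[str] = [
    "TYPE LOCATION",
    "GEOGRAPHIC SETTING",
    "GEOGRAPHICALLY ASSOCIATED SOILS",
    "DRAINAGE AND PERMEABILITY",
    "USE AND VEGETATION",
]

# Hypothetical line numbers where each heading was matched in the text.
# The third heading is absent from the document, so its match list is empty.
matches: List[List[int]] = [[10], [20], [], [30], [40]]


def flatten_dropping_empty(groups: List[List[int]]) -> List[int]:
    """Concatenate all groups; empty groups contribute nothing."""
    return [i for group in groups for i in group]


def flatten_padded(groups: List[List[int]]) -> List[Optional[int]]:
    """Keep one slot per group: the first match, or None if there is none."""
    return [group[0] if group else None for group in groups]


# Misaligned: 4 positions zipped against 5 headings, shifted after the gap.
buggy: Dict[str, int] = dict(zip(headings, flatten_dropping_empty(matches)))

# Aligned: the missing section is recorded as None and later labels stay put.
fixed: Dict[str, Optional[int]] = dict(zip(headings, flatten_padded(matches)))

print(buggy)
print(fixed)
```

In the unpadded version the missing heading takes the position that belongs to the next section, and the last heading gets no entry at all, which is the same offset-by-one pattern seen in the ZADE output.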
Going to chance it and run the refresh-extdata Action and see what changes... in a branch.