Odd attributes table behaviour in index.html when description and unitText empty.
annakrystalli opened this issue · 9 comments
When the `attributes.csv` `description` and `unitText` fields are empty, the `index.html` attributes table fields `name` and `description` are populated with the `biblio.csv` `title` and `description` fields.
Not had a chance to trace it. Will try to but just wanted to flag it.
As far as I can tell, the error is not generated in the JSON file but when the file is parsed during `build_site()`.
OK, I think I've located the issue in the template. There seems to be no `name` field in `variableMeasured` objects (sorry, I'm not too well versed in JSON terminology, hope this makes sense). At the minute, I'm working on the `dataspice` workshop I'll be running on Tuesday, and everything seems to work OK when the `attributes.csv` table is completed, except that the attributes `name` field is still pulling the `title` from `biblio` 🤷‍♀️
In all honesty, I'm a little confused myself as to what `name` and `value` in the attributes table are supposed to show. Additionally, because we sometimes have duplicate entries of the same variable across different tables (better handling of which we might want to consider in future), I suggest changing what the attributes table presents to match what folks in fact recorded in `attributes.csv`, as such:
- file name: to show which file the variable is found in
- variable name: the variable name
- description: the description
- units: the unitText
Does anyone have any objections? @amoeba @maelle @aurielfournier @khondula @cboettig If not, do you mind if I just make the change and push to master? (I've made small bug fixes and pushed already, sorry!)
Actually, doing a bit more digging on schema.org, I don't think the above suggestion will be straightforward:
- I can't find an obvious property for `fileName`, so ignore that suggestion for now. I do, however, suggest running `distinct()` on the `attributes` table after `fileName` has been removed during `write_spice()`.
- Looking into the definition of `value`:
  > The value of the quantitative value or property value node.
  > For QuantitativeValue and MonetaryAmount, the recommended type for values is 'Number'.
  > For PropertyValue, it can be 'Text;', 'Number', 'Boolean', or 'StructuredValue'.

  it seems that, if we were including the actual data, `value` would be the actual data values. So I'm not sure whether it is useful to include it in the attributes table in `index.html`.
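
The de-duplication suggested above (running `distinct()` once `fileName` has been dropped) can be illustrated with a small language-agnostic sketch. It is written in Python purely for illustration and is not how `write_spice()` is implemented; the function name `distinct_attributes`, the column name `variableName`, and the sample rows are all assumptions made up for the example.

```python
def distinct_attributes(rows, drop="fileName"):
    """Drop one column, then keep the first occurrence of each remaining row."""
    seen = set()
    unique_rows = []
    for row in rows:
        # Remove the column that distinguishes otherwise identical rows.
        trimmed = {key: value for key, value in row.items() if key != drop}
        # A sorted tuple of items gives a hashable, order-independent row key.
        row_key = tuple(sorted(trimmed.items()))
        if row_key not in seen:
            seen.add(row_key)
            unique_rows.append(trimmed)
    return unique_rows


# Hypothetical attributes rows: "site_id" is recorded for two different files.
attributes = [
    {"fileName": "a.csv", "variableName": "site_id",
     "description": "Site identifier", "unitText": ""},
    {"fileName": "b.csv", "variableName": "site_id",
     "description": "Site identifier", "unitText": ""},
    {"fileName": "b.csv", "variableName": "temp",
     "description": "Air temperature", "unitText": "degC"},
]

deduplicated = distinct_attributes(attributes)
for row in deduplicated:
    print(row)
```

Note that two rows only collapse if every remaining field matches, so the same variable described differently in two files would still appear twice.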
So my updated proposal is for the attributes table to include only these columns:
- name: the variable name
- description: the variable description
- units: the unitText
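
As a rough illustration of the proposed three-column table, the sketch below builds one row per `variableMeasured` entry from a JSON document. It is written in Python for illustration only and is not the dataspice template; the function `attribute_rows` and the sample document are hypothetical. A missing field is rendered as an empty string, so a variable with no `description` can never pick up the dataset-level `name` or `description`.

```python
import json


def attribute_rows(dataset):
    """Return one {name, description, units} row per variableMeasured entry."""
    rows = []
    for variable in dataset.get("variableMeasured", []):
        rows.append({
            # Look fields up on the variable only; never on the parent dataset.
            "name": variable.get("name", ""),
            "description": variable.get("description", ""),
            "units": variable.get("unitText", ""),
        })
    return rows


# Hypothetical metadata document; the second variable has no description or units.
doc = json.loads("""
{
  "@type": "Dataset",
  "name": "Example dataset title",
  "description": "Example dataset description",
  "variableMeasured": [
    {"@type": "PropertyValue", "name": "temp",
     "description": "Air temperature", "unitText": "degC"},
    {"@type": "PropertyValue", "name": "site_id"}
  ]
}
""")

rows = attribute_rows(doc)
for row in rows:
    print(row)
```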
Any feedback would be really appreciated!
This is what my example is being parsed as currently:
and units are not properly parsed either:
Accepting changes in PR #67, the example would look like this:
and units are properly parsed:
Yeah, I think https://schema.org/value is a bit of a weird beast. I believe it's basically just the "column type," e.g. text string, number, boolean, dateTime, etc., which I guess can be a useful thing to tell `readr` (or to map to EML's data types). Could be useful but not crucial.
For unit names, looks like you're going with EML-style names? I guess that's good because it's a common standard, though in general I think unit names that could be parsed by the `units` package might be preferable? @amoeba thoughts?
> Yeah, I think https://schema.org/value is a bit of a weird beast. I believe it's basically just the "column type," e.g. text string, number, boolean, dateTime, etc., which I guess can be a useful thing to tell readr (or to map to EML's data types). Could be useful but not crucial.

So, I thought that as well initially, but the more I looked at it, the more I was convinced that the definition on schema.org was not defining the values the `value` field could take, but rather the data type of any values in the `value` field. In any case, as we cannot record data type as part of the `dataspice` workflow, I feel it's safe to remove for now.
> For unit names, looks like you're going with EML-style names? I guess that's good because it's a common standard, though in general I think unit names that could be parsed by units package might be preferable?

Ha, busted! For this example dataset, I just lifted the attributes table that went with the example NEON dataset, which I assume is EML. Will change for the workshop to unit names that `units` can parse.
PS, tutorial is shaping up!
- tutorial: http://annakrystalli.me/dataspice-tutorial/
- intro slides: http://annakrystalli.me/dataspice-tutorial/slides.html#1
Still got a few loose ends to tie up, but I will tweet out final versions later today and report back on how the workshop went this afternoon. We can then add a link in the `dataspice` repo.
Wish me luck!
Closed via #67
Yeah, I see what you mean with `value`. Looking at the examples under https://schema.org/PropertyValue, it seems to really be the literal value in the data cell, e.g.
```json
{
  "@type": "PropertyValue",
  "name": "Wifi range",
  "value": 30,
  "unitCode": "FOT"
}
```
is the schema.org way of saying that this "wifi router has a range of 30 ft". So yeah, probably not what we want. The way it is used in the earthcube example I sometimes crib from is also confusing: it is used in the context of describing a column rather than an individual value, but in that context it seems like they are using it for what ought to be `name` instead, just as you point out. Anyway, seems safe to ignore for now.
The tutorial and slides look awesome! ✨ Good luck & knock their socks off!