dataspice - Create lightweight schema.org descriptions of Data

Question

dataspice - Create lightweight schema.org descriptions of Data

amoeba opened this issue 4 years ago · 29 comments

Submitting Author: Bryce Mecum (@amoeba)
Other Authors: Carl Boetigger (@cboettig), Scott Chamberlaim (@sckott), Auriel Fournier (@aurielfournier), Kelly Hondula (@khondula), Anna Krystalli (@annakrystalli), Maëlle Salmon (@maelle), Kate Webbink (@magpiedin), Kara Woo (@karawoo)
Repository: https://github.com/ropenscilabs/dataspice/
Version submitted: 1.0.0
Editor: @emilyriederer
Reviewers: @tdjames1, @aebratt

Due date for @tdjames1: 2021-03-21

Due date for @aebratt: 2021-03-21
Archive: TBD
Version accepted: TBD

Paste the full DESCRIPTION file inside a code block below:

Package: dataspice
Version: 1.0.0
Title: Create Lightweight Schema.org Descriptions of Data
Description: The goal of 'dataspice' is to make it easier for researchers to
  create basic, lightweight, and concise metadata files for their datasets.
  These basic files can then be used to make useful information available during
  analysis, create a helpful dataset "README" webpage, and produce more complex
  metadata formats to aid dataset discovery. Metadata fields are based on
  the 'Schema.org' and 'Ecological Metadata Language' standards.
Authors@R: c(
    person("Carl", "Boettiger", role = c("aut"), comment = "https://github.com/cboettig"),
    person("Scott", "Chamberlain", role = c("aut"), comment = "https://github.com/sckott"),
    person("Auriel", "Fournier", role = c("aut"), comment = "https://github.com/aurielfournier"),
    person("Kelly", "Hondula", role = c("aut"), comment = "https://github.com/khondula"),
    person("Anna", "Krystalli", role = c("aut"), comment = "https://github.com/annakrystalli"),
    person("Bryce", "Mecum", role = c("aut", "cre"), email = "brycemecum@gmail.com", comment = "https://github.com/amoeba"),
    person("Maëlle", "Salmon", role = c("aut"), comment = "https://github.com/maelle"),
    person("Kate", "Webbink", role = c("aut"), comment = "https://github.com/magpiedin"),
    person("Kara", "Woo", role = c("aut"), comment = "https://github.com/karawoo"),
    person("Irene", "Steves", role = c("ctb"), comment = "https://github.com/isteves"))
License: MIT + file LICENSE
URL: https://github.com/ropenscilabs/dataspice
BugReports: https://github.com/ropenscilabs/dataspice/issues
Encoding: UTF-8
LazyData: true
ByteCompile: true
RoxygenNote: 7.1.1
Imports:
    purrr,
    EML,
    fs,
    jsonlite,
    whisker,
    readr,
    stringr,
    tools,
    tibble,
    shiny,
    rhandsontable,
    dplyr,
    tidyr,
    ggplot2,
    magrittr
Suggests:
    testthat,
    kableExtra,
    knitr,
    rmarkdown,
    servr,
    listviewer,
    maps
VignetteBuilder: knitr
Roxygen: list(markdown = TRUE)

Scope

Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
- data retrieval
- data extraction
- data munging
- data deposition
- workflow automation
- version control
- citation management and bibliometrics
- scientific software wrappers
- field and lab reproducibility tools
- database software bindings
- geospatial data
- text analysis
Explain how and why the package falls under these categories (briefly, 1-2 sentences):

dataspice helps create metadata for deposit/publication of data
Who is the target audience and what are scientific applications of this package?

The target audience is people producing data of all types, scientists and otherwise. Documenting data has a multitude of scientific applications including data publication, data sharing, data integration, etc.
Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?

No other packages do the same thing (to my knowledge). dataspice is somewhat related to EML, similar to codebook and EMLAssemblyLine.
(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?

I don't think this is applicable to our package.
If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

@maelle and I have talked about this on ropensci Slack

Technical checks

Confirm each of the following by checking the box.

I have read the guide for authors and rOpenSci packaging guide.

This package:

does not violate the Terms of Service of any service it interacts with.
has a CRAN and OSI accepted license.
contains a README with instructions for installing the development version.
includes documentation with examples for all functions, created with roxygen2.
contains a vignette with examples of its essential functions and uses.
has a test suite.
has continuous integration, including reporting of test coverage using services such as Travis CI, Coveralls and/or CodeCov.

Publication options

Do you intend for this package to go on CRAN?
Do you intend for this package to go on Bioconductor?
Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:

MEE Options

The package is novel and will be of interest to the broad readership of the journal.
The manuscript describing the package is no longer than 3000 words.
You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
(Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
(Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
(Please do not submit your package separately to Methods in Ecology and Evolution)

Code of conduct

I agree to abide by rOpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

Answer 1 · 2021-02-09T00:57:48.000Z

Thank you for the submission. Due to conflicts of interests with the current Associate Editors, we need to find a Guest Editor for this package. Please, bear with me as I find a person to handle your submission. You will be notified as soon as someone is assigned.

Answer 2 · 2021-02-17T18:54:25.000Z

Very happy to share that your Guest Handling Editor is @emilyriederer

She will be completing the automated checks soon.

Answer 3 · 2021-02-19T00:58:43.000Z

Editor checks:

Documentation: The package has sufficient documentation available online (README, pkgdown docs) to allow for an assessment of functionality and scope without installing the package.
Fit: The package meets criteria for fit and overlap
Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service.
License: The package has a CRAN or OSI accepted license
Repository: The repository link resolves correctly

Editor comments

Hi Team! Thank you so much for your submission. I have run the preliminary checks and overall everything looks good. In summary, below I highlight a few small spelling mistakes, one styling issue flagged by goodpractice, and one question about the current testing plan. Please let me know what you think.

Spelling

The following may be spelling mistakes. I see "shinyapp" and "Shinyapps" are consistently used as single words, though, so I am open to that being a style choice.

> spelling::spell_check_package()
DESCRIPTION does not contain 'Language' field. Defaulting to 'en-US'.
  WORD                    FOUND IN
rectagular              populating-templates.Rmd:22
shinyapp                edit_access.Rd:14
                        edit_attributes.Rd:14
                        edit_biblio.Rd:14
                        edit_creators.Rd:14
Shinyapps               edit_access.Rd:5
                        edit_attributes.Rd:5
                        edit_biblio.Rd:5
                        edit_creators.Rd:5
worfklowdiagram         README.md:54
                        README.Rmd:56

Styling

goodpractice highlights many cases where lines longer than the recommended 80 characters:

> goodpractice::gp()

R/build_site.R
R/edit_file.R
R/eml_crosswalk.R
R/eml_to_spice.R
R/jsonld_to_mustache.R
R/prep.R
R/prep_access.R
R/serve_site.R
R/spice_to_eml.R
R/validate_metadata.R
R/write_spice.R
tests/testthat/setup.R
tests/testthat/test-crosswalks.R
tests/testthat/test-eml_to_spice.R
tests/testthat/test-prep_attribute.R
tests/testthat/test-write_spice.R

Continuous integration

I see your testing R CMD CHECK across multiple OS on GitHub Actions. Do you plan to run your testthat unit tests here as well? The don't seem to be integrated currently.

Reviewers:
Due date:

Answer 4 · 2021-02-19T01:05:49.000Z

Hi @emilyriederer, thanks for the prompt review! I'll get back to you before the start of next week with fixes and responses.

Answer 5 · 2021-02-22T02:20:48.000Z

Hey again @emilyriederer: I've had a chance to go over your review and didn't run into any problems. I'll address your feedback in the same structure you provided it:

Spelling: All fixed. I went with "Shiny app" everywhere as that seems to be the preferred nomenclature.
Styling: I mostly was able to address these but stopped short of editing some of the lines that crossed the 80 character mark (though I fixed many). The ones I didn't change seemed to me like they'd become less readable if I did any reformatting, e.g., https://github.com/ropenscilabs/dataspice/blob/1adb1a25cf19b5caee8ecd4f5e1dc3971e9a2a48/R/edit_file.R#L94-L96. Please let me know you think this is acceptable.
Continuous integration: It's my understanding that the GH Action we run does run the tests when it runs R CMD CHECK. I double-checked by checking the logs of the latest run and I see:
```
* checking tests ...
  Running ‘testthat.R’
  OK
```
Do you agree with my take?

Thanks again for the review -- the changes solidly tightened up some parts of the package. Let me know if you have any questions. My changes are pushed to the new main branch (was master) and can be seen at https://github.com/ropenscilabs/dataspice/tree/1adb1a25cf19b5caee8ecd4f5e1dc3971e9a2a48.

Answer 6 · 2021-02-24T00:19:20.000Z

Thanks for the replies, @amoeba ! That all makes sense and looks good to me. My apologies on the last issue; I checked the log for the CI but somehow missed where the tests were getting called. I'm on board now. Have you all considered using covr to show test coverage statistics in the repo also?

I've begun the search for reviewers and have confirmed @aebratt as the first reviewer. I will hopefully confirm the second shortly.

Answer 7 · 2021-02-24T00:37:05.000Z

Have you all considered using covr to show test coverage statistics in the repo also?

Not as a group but I had thought about it in light of onboarding and Continuous Integration Best Practices. Might be good to set up a separate test workflow and loop in covr. If we go that route, should I aim to have that working as part of onboarding or could it be done after?

Answer 8 · 2021-02-24T00:51:07.000Z

I think it's definitely a good thing to have possibly before final onboarding, but it's a hard requirement! I'll continue the search for reviewer and we'll certainly keep the ball rolling.

Answer 9 · 2021-02-28T12:33:57.000Z

@ropensci-review-bot add @tdjames1 to reviewers

Answer 10 · 2021-02-28T12:34:25.000Z

@tdjames1 added to the reviewers list. Review due date is 2021-03-21. Thanks @tdjames1 for accepting to review! Please refer to our reviewer guide.

Answer 11 · 2021-02-28T12:37:06.000Z

@ropensci-review-bot add @aebratt to reviewers

Answer 12 · 2021-02-28T12:37:08.000Z

@aebratt added to the reviewers list. Review due date is 2021-03-21. Thanks @aebratt for accepting to review! Please refer to our reviewer guide.

Answer 13 · 2021-03-12T21:29:20.000Z

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Briefly describe any working relationship you have (had) with the package authors.
- NA
As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

A statement of need clearly stating problems the software is designed to solve and its target audience in README
- Looks great. I think this is a compelling package with a lot of use cases.
Installation instructions: for the development version of package and any non-standard dependencies in README
- Installed well on my machine.
Vignette(s) demonstrating major functionality that runs successfully locally
- Vignette runs locally and demonstrates some functionality. However, it does not include
  a description of the edit_xx functions which open a Shiny app for editing metadata attributes.
  In my opinion this is a nice piece of functionality in this package.
Function Documentation: for all exported functions
- Looks good.
Examples (that run successfully locally) for all exported functions
- This is a nit-pick but I think that calling the example here fully-worked is somewhat misleading, since the csv editing is happening outside of R, for most. I think that could be made clearer.
- I also think that this tutorial by one of the authors is excellent, but it's not clearly linked from other documentation.
Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
- Looks good.

Functionality

Installation: Installation succeeds as documented.
- Installed well.
Functionality: Any functional claims of the software been confirmed.
- I explored the package with a few of my own use-cases without any issues.
Performance: Any performance claims of the software been confirmed.
- Again, I explored the package with a few of my own use-cases without any issues.
Automated tests: Unit tests cover essential functions of the package
and a reasonable range of inputs and conditions. All tests pass on the local machine.
- Was able to pass tests but I agree with the above discussion that using covr to show test coverage statistics within the repository would be nice to have. Has there been progress made on that front?
Packaging guidelines: The package conforms to the rOpenSci packaging guidelines
- Looks good.

Estimated hours spent reviewing: 4 hours, much of which was playing with use cases.

Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

This is a really slick package that is well-done and widely applicable. Nice work, team! My primary suggestion would be to standardize the documentation so that the pacakage is easy to learn from multiple points of entry (see suggestions about README, Example, Vignette, and Tutorial above). In this documentation I would like to see the Shiny functionality reiterated. I am also in support of using covr. Additionally, I noted a couple of small style notes below:

Style

I am on board with keeping the few remaining long lines long for stylistic reasons. I did notice
what I think is a typo:

spatialCoverge          parse_GeoShape_box.Rd:5,16
                        parse_spatialCoverage.Rd:5,16

and I am guessing that this should say "spatialCoverage". It is just in the documentation though.

I also noticed that the pacakge imports spread from tidyR which I believe was deprecated by pivod_wider. Is that intentional?

Answer 14 · 2021-03-15T00:03:56.000Z

Thanks so much for the review, @aebratt !

Answer 15 · 2021-03-21T23:36:54.000Z

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Briefly describe any working relationship you have (had) with the package authors.
- Scott Chamberlain: Co-member of sAPROPOS working group, Feb 2017
- Anna Krystalli: Co-organiser of Sheffield R Users Group
As the reviewer I confirm that there are no conflicts of interest for me to review this work.

Documentation

The package includes all the following forms of documentation:

A statement of need clearly stating problems the software is designed to solve and its target audience in README
Installation instructions: for the development version of package and any non-standard dependencies in README
Vignette(s) demonstrating major functionality that runs successfully locally
Function Documentation: for all exported functions
Examples (that run successfully locally) for all exported functions
Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

For packages co-submitting to JOSS

The package has an obvious research application according to JOSS's definition

The package contains a paper.md matching JOSS's requirements with:

A short summary describing the high-level functionality of the software

Authors: A list of authors with their affiliations

A statement of need clearly stating problems the software is designed to solve and its target audience.

References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

Installation: Installation succeeds as documented.
Functionality: Any functional claims of the software been confirmed.
Performance: Any performance claims of the software been confirmed.
Automated tests: Unit tests cover essential functions of the package
and a reasonable range of inputs and conditions. All tests pass on the local machine.
Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Estimated hours spent reviewing: 6

Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Review Comments

This is a neat, self-contained package that makes it easy for dataset owners to create standardised metadata to improve visibility and usability of their data.

Package installation and checks

Installation and package checks ran smoothly except size reduction checks which produced a warning about qpdf - I do have this installed so I'm not sure why these checks were not carried out.

goodpractice reported several problems with long lines, relating to (1) strings providing help text for the Shiny app and (2) test invocations of the build_site() function. For (1), you could use the flexibility in the ... argument to the Shiny HTML builder functions which results in unnamed arguments becoming children of the element, e.g.:

shiny::h6("fileName: The name of the input data file(s). ",
          "Don't change this.")

For (2), you could create an intermediate variable to hold the file path or break the line before the package argument.

goodpractice also flagged up calls to setwd() in test functions. However, the authors have already followed the advice given by goodpractice about "using on.exit() to restore the working directory."

Package metadata

devtools::spell_check() complains about missing Language field in DESCRIPTION.
Suggest fixing misspelling of spatialCoverage in documentation for parse_GeoShape_box and parse_spatialCoverage functions.

Documentation

README

In online documentation (https://docs.ropensci.org/dataspice), link to "metadata standards" opens up https://github.com/ropenscilabs/dataspice#resources rather than linking to https://docs.ropensci.org/dataspice/#resources.
"An example of how Google sees this can be found here." leads to a page which contains a banner stating that the structured data testing tool is being deprecated. Can you find an alternative demonstration of validating the structured data?
Convert to EML section: the output produced by spice_to_eml(spice) is very wide and not easy to read, is there any way to resolve this via knitr options? e.g. https://bookdown.org/yihui/rmarkdown-cookbook/text-width.html
There is also a missing space following a full stop in the output: "produced.You".

Help files

No top-level documentation ?dataspice. Consider usethis::use_package_doc() to add a template for the top-level documentation.
validate_file_path() help text is headed "Prepare attributes" presumably due to shared content - it's a bit confusing. Also no example usage for this function in the example.
prepare_access() - no example given.
eml_to_spice() - typo in example "disd".
write_spice() - no example given.

Shiny app

'The units the variable was measured in.' -> 'The units in which the variable was measured.'

Vignettes

Creating Schema.org Metadata and README website with dataspice

Set width of figures to match page width.
Under Step 4, final sentence appears to be incomplete: "serve_site() ?"

Functionality

Worked well with provided datasets and other trial datasets. A couple of comments on the Shiny app user interface:

edit_attributes(): help text for fileName, varableName is marked 'Do not change' - is it possible to enforce this by setting the columns to read only?

edit_biblio(): the list of metadata covered is rather long, which makes it hard to read the help text while filling in the table. Some possible solutions: offer help text as onhover or onclick of table headings; split up table into chunks to match the metadata headings.

Code

Source code is well-formulated, clear and free from duplication.

Some file naming could be adjusted to match function naming e.g. prep.R -> prep_attributes.R.

Testing

Test coverage is a bit low and could be improved by providing some missing tests e.g. prep_access(), validate_metadata().

Also, rOpenSci Packaging guide suggests testing shiny app components: "Packages with shiny apps should use a unit-testing framework such as shinytest to test that interactive interfaces behave as expected."

Answer 16 · 2021-03-22T01:31:02.000Z

Thank you @aebratt and @tdjames1 for the detailed reviews and helpful feedback! @amoeba , the ball is back in your court now! In the next few weeks, please take some time to consider the feedback, engage with the reviewers, and make any relevant changes.

In the meantime, I have a few more administrative favors to ask of you:

Please add a badge to your README denoting that the package is under rOpenSci review. You may do this with the rodev::use_review_badge(426) command or by directly pasting the following in your MD file:

[![](https://badges.ropensci.org/426_status.svg)](https://github.com/ropensci/software-review/issues/426)

Please also add a NEWS.md document to your repo also

Thanks again to all!

Answer 17 · 2021-03-22T19:12:33.000Z

Thanks @emilyriederer, @aebratt, and @tdjames1. I'll get on your two items above, @emilyriederer, and plan to work through the reviews over the next two weeks.

Answer 18 · 2021-04-12T03:05:13.000Z

Hey again @emilyriederer, @aebratt, and @tdjames1. I'm a bit late in getting to all of your feedback but have carved out some time tonight to start. So far I've collected all of the feedback in order to see how much time it might take me and I think I can get everything addressed within the week. Apologies for the delay!

Answer 19 · 2021-04-16T02:22:23.000Z

Don't worry at all, @amoeba ! I completely understand. Thanks so much for the heads up, and we'll just follow your lead on the timeline.

Answer 20 · 2021-04-25T11:12:27.000Z

Hey @amoeba ! Hope all is well. I just thought I'd check in and see if you have a timeline in mind for the next steps? No pressure -- I know there's a lot going on these days! -- but thought it could be good to get a sense from you. Thanks!

Answer 21 · 2021-04-29T06:22:26.000Z

Hey @emilyriederer, apologies for the delay. I've gone over both reviews and made changes and am ready to go over my changes with the reviewers.

Thanks @aebratt and @tdjames1 for the very thorough dives into the package and all of the suggestions. Just about every suggestion or idea was reasonable and I've made changes in a branch at https://github.com/ropenscilabs/dataspice/tree/onboarding-comments.

A full item-by-item response to all of your comments is at the bottom under "Full comments" but they're mostly me agreeing with you and making the change. I've included it in the hope that it makes it easier for you both to verify I've addressed your comments.

For all of your comments that required a conversation, I've put them directly below under individual headers. Please have a look and let me know what you think.

Code Coverage / shinytest (@emilyriederer, @aebratt , @tdjames1 )

This has come up a few times during review and I've done my best to bump up coverage. The package is now at ~72%

> covr::package_coverage()
dataspice Coverage: 72.17%
R/serve_site.R: 0.00%
R/edit_file.R: 0.47%
R/prep_attributes.R: 91.89%
R/prep_access.R: 94.74%
R/eml_to_spice.R: 98.08%
R/write_spice.R: 98.28%
R/eml_crosswalk.R: 98.57%
R/build_site.R: 100.00%
R/create_spice.R: 100.00%
R/jsonld_to_mustache.R: 100.00%
R/Person.R: 100.00%
R/PropertyValue.R: 100.00%
R/spice_to_eml.R: 100.00%
R/validate_metadata.R: 100.00%
R/write_jsonld.R: 100.00%

edit_file.R (mostly) and serve_site.R are dragging the average down. edit_file.R would require some work to get high coverage. @emilyriederer: Is this is an acceptable level of coverage given the nature of the code?

use of `tidyr::spread` (@aebratt)

I think we used this just because the pivot_* functions weren't around. spread is marked as superceded. I've filed an issue for the future at ropensci/dataspice#101 but I'd like to keep it as is for now. Sound good?

spice_to_eml output width (@tdjames1)

I wasn't able to find anything that quite did the trick so I'm inclined to leave it as is. Is that alright?

edit_biblio() width (@tdjames)

edit_biblio(): the list of metadata covered is rather long, which makes it hard to read the help text while filling in the table. Some possible solutions: offer help text as onhover or onclick of table headings; split up table into chunks to match the metadata headings.

I'm inclined to leave it as is but have filed an issue to track it. Sound good?

Hopefully I've addressed all of your comments @aebratt and @tdjames1. Let me know if that's not the case. I'm looking forward to hearing what you both think and thanks again for the thorough reviews. This has been super helpful so far.

Full comments

@aebratt review

Vignette runs locally and demonstrates some functionality. However, it does not include
a description of the edit_xx functions which open a Shiny app for editing metadata attributes.
In my opinion this is a nice piece of functionality in this package.

Good catch. I think this vignette was written before the Shiny apps were made and we didn't update the vignette. Fixed.

This is a nit-pick but I think that calling the example here fully-worked is somewhat misleading, since the csv editing is happening outside of R, for most. I think that could be made clearer.

I agree with your take so I've updated that section to be more clear about what is being shown.
We use the phrasing "An basic example repository" now instead.

I also think that this tutorial by one of the authors is excellent, but it's not clearly linked from other documentation.

Right, I'm not sure I had seen that and I've linked it in the Example section.

Was able to pass tests but I agree with the above discussion that using covr to show test coverage statistics within the repository would be nice to have. Has there been progress made on that front?

I've gone through and bumped up test coverage for the package as a whole to ~80% and all files that really can be tested to over 90%.

In this documentation I would like to see the Shiny functionality reiterated. I am also in support of using covr.

Additionally, I noted a couple of small style notes below:

spatialCoverge typo

Fixed.

I also noticed that the pacakge imports spread from tidyR which I believe was deprecated by pivod_wider. Is that intentional?

Just timing. I'm not sure the pivot_* were around then? I'll file an issue to get converted over post-review. spread isn't deprecated but "superceded".

@tdjames1' review:

goodpractice reported several problems with long lines

I've fixed all of these.

devtools::spell_check() complains about missing Language field in DESCRIPTION.

Fixed.

Suggest fixing misspelling of spatialCoverage in documentation for parse_GeoShape_box and parse_spatialCoverage functions.

Fixed.

In online documentation (https://docs.ropensci.org/dataspice), link to "metadata standards" opens up https://github.com/ropenscilabs/dataspice#resources rather than linking to https://docs.ropensci.org/dataspice/#resources.

Fixed with a relative link so the link works everywhere.

"An example of how Google sees this can be found here." leads to a page which contains a banner stating that the structured data testing tool is being deprecated. Can you find an alternative demonstration of validating the structured data?

Fixed. Google has a new tool and I've updated the link.

Convert to EML section: the output produced by spice_to_eml(spice) is very wide and not easy to read, is there any way to resolve this via knitr options? e.g. https://bookdown.org/yihui/rmarkdown-cookbook/text-width.html

I tried options(width...) and a few other things and wasn't able to come up with something. I'm inclined to leave things as is since the messages appear fine in R. What do you think?

There is also a missing space following a full stop in the output: "produced.You".

Fixed.

No top-level documentation ?dataspice. Consider usethis::use_package_doc() to add a template for the top-level documentation.

Nice trick! Done and thanks!

validate_file_path() help text is headed "Prepare attributes" presumably due to shared content - it's a bit confusing. > Also no example usage for this function in the example.

Fixed the incorrect title and set up a proper documentation page.

prepare_access() - no example given.

Fixed.

eml_to_spice() - typo in example "disd".

Fixed.

write_spice() - no example given.

Fixed.

Shiny app: 'The units the variable was measured in.' -> 'The units in which the variable was measured.'

Fixed.

Vignettes:
Set width of figures to match page width.

Fixed.

Under Step 4, final sentence appears to be incomplete: "serve_site() ?"

Fixed by removing it. That must have been a question the author had and never resolved.

edit_attributes(): help text for fileName, varableName is marked 'Do not change' - is it possible to enforce this by setting the columns to read only?

This is a great idea. All I can find is a way to make these tables read-only in their entirety. I'll file an issue so we can look at this for the future.

edit_biblio(): the list of metadata covered is rather long, which makes it hard to read the help text while filling in the table. Some possible solutions: offer help text as onhover or onclick of table headings; split up table into chunks to match the metadata headings.

Totally agree. This might take some deep digging so I'll file an issue for us to track it. Thanks.

Code: Some file naming could be adjusted to match function naming e.g. prep.R -> prep_attributes.R.

Good point, changed.

Test coverage is a bit low and could be improved by providing some missing tests e.g. prep_access(), validate_metadata().

I've gone through and bumped up test coverage for the package as a whole to ~80% and all files that really can be tested to over 90%.

Also, rOpenSci Packaging guide suggests testing shiny app components: "Packages with shiny apps should use a unit-testing framework such as shinytest to test that interactive interfaces behave as expected."

Good catch. I knew about shinytest but hadn't looked at how to use it. From what I can tell, it requires the Shiny apps under test to be written as stand-alone apps which ours aren't. Refactoring would take a bit of time for sure. We might ask @emilyriederer to weigh in on whether this is a strict requirement or not.

Answer 22 · 2021-04-29T23:15:30.000Z

Thanks for the update, @amoeba ! I also appreciate you're working to raise the test coverage. I think that should be good.

@aebratt and @tdjames1 , please let us know if you approve. If you do, please use the reviewer approval template

Answer 23 · 2021-05-06T18:01:51.000Z

Reviewer Response

Hi @amoeba, thanks for the thorough response to my review. I'm satisfied that you've resolved the points that I raised. I've responded below to the specific queries that you flagged up.

Also, I noticed a typo that was introduced in the overview vignette so I've made a pull request to fix that in your onboarding branch.

Code Coverage / shinytest (@emilyriederer, @aebratt , @tdjames1 )

This has come up a few times during review and I've done my best to bump up coverage. The package is now at ~72%

That seems pretty good to me.

spice_to_eml output width (@tdjames1)

I wasn't able to find anything that quite did the trick so I'm inclined to leave it as is. Is that alright?

Sure – it's only cosmetic so no problem to leave as is. Just one of those annoying things!

edit_biblio() width (@tdjames)

edit_biblio(): the list of metadata covered is rather long, which makes it hard to read the help text while filling in the table. Some possible solutions: offer help text as onhover or onclick of table headings; split up table into chunks to match the metadata headings.

I'm inclined to leave it as is but have filed an issue to track it. Sound good?

Sounds good.

edit_attributes(): help text for fileName, varableName is marked 'Do not change' - is it possible to enforce this by setting the columns to read only?

This is a great idea. All I can find is a way to make these tables read-only in their entirety. I'll file an issue so we can look at this for the future.

This should be possible according to the rhandsontable documentation:
"The individual columns, rows and cells can also be set to readOnly"
I'm noting this point here as I can't find the issue.

Final approval (post-review)

The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 7

Answer 24 · 2021-05-14T21:37:45.000Z

Reviewer Response

Hi @amoeba. Apologies for the delay. Thanks for the thorough response to our reviews. I am quite happy with the changes you've made! In particular:

Code Coverage / shinytest (@emilyriederer, @aebratt , @tdjames1 )

This has come up a few times during review and I've done my best to bump up coverage. The package is now at ~72%

On board with this

use of tidyr::spread (@aebratt)

I think we used this just because the pivot_* functions weren't around. spread is marked as superceded. I've filed an issue for the future at ropenscilabs/dataspice#101 but I'd like to keep it as is for now. Sound good?

Sounds good!

Final approval (post-review)

The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 5

Answer 25 · 2021-05-16T16:08:13.000Z

Approved! Thanks @amoeba for submitting and @aebratt @tdjames1 for your reviews! 😄

To-dos:

Transfer the repo to rOpenSci's "ropensci" GitHub organization under "Settings" in your repo. I have invited you to a team that should allow you to do so. You'll be made admin once you do.
Fix all links to the GitHub repo to point to the repo under the ropensci organization.
Delete your current code of conduct file if you had one since rOpenSci's default one will apply, see https://devguide.ropensci.org/collaboration.html#coc-file
If you already had a pkgdown website and are ok relying only on rOpenSci central docs building and branding,
- deactivate the automatic deployment you might have set up
- remove styling tweaks from your pkgdown config but keep that config file
- replace the whole current pkgdown website with a redirecting page
- replace your package docs URL with https://docs.ropensci.org/package_name
- In addition, in your DESCRIPTION file, include the docs link in the URL field alongside the link to the GitHub repository, e.g.: URL: https://docs.ropensci.org/foobar (website) https://github.com/ropensci/foobar
Fix any links in badges for CI and coverage to point to the ropensci URL. We no longer transfer Appveyor projects to ropensci Appveyor account so after transfer of your repo to rOpenSci's "ropensci" GitHub organization the badge should be [![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/ropensci/pkgname?branch=master&svg=true)](https://ci.appveyor.com/project/individualaccount/pkgname). If Appveyor does not pick up new commits after transfer, you might need to delete and re-create the Appveyor project. (Repo transfers are smoother with GitHub Actions)
Please check you updated the package version post-review version updated and that you documented all changes in NEWS.md
We're starting to roll out software metadata files to all ropensci packages via the Codemeta initiative, see https://github.com/ropensci/codemetar/#codemetar for how to include it in your package, after installing the package - should be easy as running codemetar::write_codemeta() in the root of your package.

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this here.

Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @stefaniebutland in your reply. She will get in touch about timing and can answer any questions.

We've put together an online book with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here. Additionally, since you plan to submit to CRAN, please check out our CRAN gotchas

Answer 26 · 2021-05-16T18:55:02.000Z

@amoeba If you're willing, I think this would make a great rOpenSci blog post for several reasons. Our community call The Wild World of Data Repositories was our most popular ever. ~160 people from 19 countries on 5 continents attended and it was remarkable that in a poll of 133 attendees, 44% self-identified as being affiliated with a repository or infrastructure org; 56% not. Broad interest.

@sinarueeger wrote her perspective on the community call The headache with data repositories.

It's also remarkable that this started as an unconf18 project 🚀 and y'all have taken it this far.

Let me know what you think :-)

Answer 27 · 2021-05-16T21:14:25.000Z

Thank you both so much, @aebratt and @tdjames1, for your detailed reviews. I particularly appreciate the amount of time you invested and the level of detail at which you combed over the package. Would it be alright if we listed you both in the package description with the rev role code? If you have an ORCID, let me know what it is and I'll put it alongside your entries.

Hey @emilyriederer, thanks for handling this submission. I've gone ahead and transferred the package to https://github.com/ropensci and completed the rest of the changes on your checklist above. I think one or two final details might need taking care of but we should otherwise be good to go.

Re: a tech note or blog post, I have it in the back of my mind that I promised @stefaniebutland a blog post a while ago. I can't commit to one right now but I'll re-evaluate in a few weeks and hopefully I can carve out some time. I think a blog post is very good idea.

Answer 28 · 2021-05-16T21:49:08.000Z

Hi @amoeba, I'd be happy with being listed in the package description. My ORCID is 0000-0003-1363-4742. Thanks and good work getting the package approved!

Answer 29 · 2021-06-17T07:05:11.000Z

Thanks @tdjames1, we'll get you listed. Thanks again for the review.

Scope

Technical checks

Publication options

Code of conduct

Editor checks:

Editor comments

Spelling

Styling

Continuous integration

Package Review

Documentation

Functionality

Review Comments

Style

Package Review

Documentation

For packages co-submitting to JOSS

Functionality

Review Comments

Package installation and checks

Package metadata

Documentation

Functionality

Code

Testing

Code Coverage / shinytest (@emilyriederer, @aebratt , @tdjames1 )

use of tidyr::spread (@aebratt)

spice_to_eml output width (@tdjames1)

edit_biblio() width (@tdjames)

@aebratt review

@tdjames1' review:

Reviewer Response

Code Coverage / shinytest (@emilyriederer, @aebratt , @tdjames1 )

spice_to_eml output width (@tdjames1)

edit_biblio() width (@tdjames)

Final approval (post-review)

Reviewer Response

Code Coverage / shinytest (@emilyriederer, @aebratt , @tdjames1 )

use of tidyr::spread (@aebratt)

Final approval (post-review)

use of `tidyr::spread` (@aebratt)

use of `tidyr::spread` (@aebratt)