gdcc/pyDataverse

API fails to handle metadataBlocks: "astrophysics" & "biomedical"

Opened this issue · 5 comments

Bug report

The latest version of dataset-create-new-all-default-fields.json also allows "astrophysics" & "biomedical" metadata to be populated for a newly created dataset. However, the current version of src/pyDataverse/models.py does not seem to be able to handle these metadata blocks.

1. Describe your environment

  • OS: Linux, Ubuntu 22.04.1, 64bit
  • pyDataverse: 0.3.1
  • Python: 3.10.12
  • Dataverse: v. 6.0 build 1512-366fd41

2. Actual behaviour:

  • Create a new dataset with the "astrophysics" & "biomedical" metadata and notice that the metadata is not populated in Dataverse:
    from pyDataverse.models import Dataset

    # native_api (a pyDataverse NativeApi instance) and dataverse_id
    # are assumed to be defined already.
    ds = Dataset()
    ds.set({"title": "Test Data Set"})
    ds.set({"license": "CC0 1.0"})
    ds.set({"astroType": ["Mosaic"],
            "astroFacility": ["AIK-2", "AIK-3"],
            "studyDesignType": ["Case Control", "Cross Sectional"],
            "studyAssayOrganism": ["Arabidopsis thaliana", "Bos taurus", "Zea mays"]})
    ds.validate_json()
    native_api.create_dataset(dataverse=dataverse_id, metadata=ds.json(), auth=True)
  • I have tried populating fields from different blocks; most of them succeeded, except for these two blocks.

3. Expected behaviour:

  • The API should also be able to populate the metadata blocks "astrophysics" & "biomedical".

4. Steps to reproduce

  1. Create a new dataset using the method shown under "Actual behaviour" above.
  2. Check the created dataset in Dataverse and see that the "astrophysics" & "biomedical" metadata are missing.
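
The check in step 2 could also be done programmatically. The following is a minimal sketch, assuming the dataset JSON returned by the Dataverse native API nests its blocks under `data.latestVersion.metadataBlocks`; the helper name `missing_metadata_blocks` and the sample response are hypothetical and only illustrate the idea.

```python
from typing import Iterable


def missing_metadata_blocks(dataset_json: dict, expected: Iterable[str]) -> list:
    """Return the expected block names that are absent from the latest version."""
    blocks = (
        dataset_json.get("data", {})
        .get("latestVersion", {})
        .get("metadataBlocks", {})
    )
    return [name for name in expected if name not in blocks]


# Hypothetical response body, trimmed to the keys used above.
example_response = {
    "data": {"latestVersion": {"metadataBlocks": {"citation": {"fields": []}}}}
}

missing = missing_metadata_blocks(
    example_response, ["citation", "astrophysics", "biomedical"]
)
print(missing)
```

In practice the dictionary would presumably come from something like `native_api.get_dataset(pid).json()` rather than a hand-written sample.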

5. Possible solution

  • As mentioned in the description above, the model needs to be extended so that it can populate other metadata blocks as well.
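
Until the model is extended, one possible workaround is to add the two blocks to the JSON payload by hand before submitting it. This is only a sketch under assumptions: the helpers `make_field` and `add_block` are hypothetical, the field layout follows the one used in dataset-create-new-all-default-fields.json, and the `typeClass` values and display names are my reading of that file and should be checked against it. It has not been tested against Dataverse 6.0.

```python
import json


def make_field(type_name, value, type_class, multiple=True):
    """Build one field entry in the native-API JSON layout (hypothetical helper)."""
    return {
        "typeName": type_name,
        "multiple": multiple,
        "typeClass": type_class,
        "value": value,
    }


def add_block(payload, block_name, display_name, fields):
    """Attach a metadata block to a dataset payload (hypothetical helper)."""
    blocks = payload["datasetVersion"].setdefault("metadataBlocks", {})
    blocks[block_name] = {"displayName": display_name, "fields": fields}
    return payload


# Stand-in for json.loads(ds.json()); the real output also carries citation fields.
payload = {"datasetVersion": {"metadataBlocks": {"citation": {"fields": []}}}}

# typeClass values below are assumptions; verify them against the example JSON.
add_block(payload, "astrophysics", "Astronomy and Astrophysics Metadata", [
    make_field("astroType", ["Mosaic"], "controlledVocabulary"),
    make_field("astroFacility", ["AIK-2", "AIK-3"], "primitive"),
])
add_block(payload, "biomedical", "Life Sciences Metadata", [
    make_field("studyDesignType", ["Case Control", "Cross Sectional"],
               "controlledVocabulary"),
    make_field("studyAssayOrganism",
               ["Arabidopsis thaliana", "Bos taurus", "Zea mays"],
               "controlledVocabulary"),
])

metadata_json = json.dumps(payload)
```

The resulting `metadata_json` string could then be passed as `metadata=` to `native_api.create_dataset(...)` in place of `ds.json()`, since that argument is the JSON body sent to the server.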

I don't mean to speak for @JR-1991 but I believe this is supported by EasyDataverse: https://github.com/gdcc/easyDataverse

@pdurbin, thanks for raising this. The latest version of EasyDataverse now handles metadata configs dynamically, so you can easily get metadata schemes as classes and populate them like any other Python/Pydantic dataclass.

I've tested your use case and created a Jupyter Notebook that shows how to upload metadata to Demo Dataverse. You can also find brief documentation of other features in the README file of the flexible-connect branch.

Let me know if you have any questions 😊

Colab notebook

@pdurbin Thanks for the suggestion. Does that mean I have to switch to easyDataverse instead of pyDataverse? Also, I guess no future pyDataverse extension is planned to support this?

@JR-1991 I will try this solution as well. Thanks for the detailed explanation with the snippets too 👍

@jmurugan-fzj - For the time being, I suggest using EasyDataverse to work around the issue. However, in the future, you can continue to use PyDataverse since we are currently planning the next version of PyDataverse. This will address a multitude of issues and PRs filed within this repository.