NSAPH-Projects/space

Add checksum value to data sets


The SpaceEnv.download_data function only checks whether the data directory exists; if it does, the download is skipped. If a file is missing, a 'file not found' error will eventually be triggered, but this approach cannot detect files that have been modified. It would be useful to store a checksum for each data file in the master file. To validate the data, we can then compare the stored checksums with checksums computed from the current files.
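A rough sketch of what the validation step could look like, assuming the master file can be read into a plain mapping of file name to expected hex digest. The function names (`file_checksum`, `verify_files`) and the `expected` mapping are hypothetical and not part of the current code base; SHA-256 is only a default choice here.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_files(data_dir, expected, algorithm="sha256"):
    """Compare files in data_dir against expected digests.

    `expected` is a hypothetical mapping {file name: hex digest}, e.g. loaded
    from the master file. Returns {file name: "missing" or "modified"} for
    every file that fails the check; an empty dict means everything matched.
    """
    problems = {}
    for name, want in expected.items():
        path = Path(data_dir) / name
        if not path.is_file():
            problems[name] = "missing"
        elif file_checksum(path, algorithm) != want.lower():
            problems[name] = "modified"
    return problems
```

With something like this, download_data could re-download only when `verify_files` returns a non-empty result, instead of relying on the directory check alone.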

@atrisovic
Does Dataverse have built-in functionality for checksums?

Yes, it does. It's worth double-checking whether it's incorporated in pyDV; if not, it's not hard to implement.
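If it turns out we have to implement it ourselves, something along these lines might work. This is a sketch only: it assumes the dataset metadata JSON has already been fetched and that each file entry carries a `dataFile` object with `filename` and a `checksum` of the form `{"type": ..., "value": ...}`. That shape should be confirmed against the actual response before relying on it, and `checksums_from_dataset_metadata` is a hypothetical helper name.

```python
def checksums_from_dataset_metadata(dataset_json):
    """Extract {filename: (checksum type, checksum value)} from dataset metadata.

    ASSUMPTION: dataset_json looks like
    {"data": {"latestVersion": {"files": [{"dataFile": {...}}, ...]}}}
    with "filename" and "checksum" keys inside each "dataFile".
    Entries without a checksum are skipped rather than raising.
    """
    files = dataset_json.get("data", {}).get("latestVersion", {}).get("files", [])
    result = {}
    for entry in files:
        data_file = entry.get("dataFile", {})
        checksum = data_file.get("checksum")
        if checksum and "filename" in data_file:
            result[data_file["filename"]] = (checksum["type"], checksum["value"])
    return result
```

The returned checksum type would tell us which hash algorithm to use locally when comparing against the downloaded files.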