enram/data-repository

Rethink/fix coverage calendar-heatmap

Closed this issue · 6 comments

[Screenshot 2022-04-27 at 17 16 26: calendar heatmap of the coverage for a radar]

The above calendar heatmap shows the coverage for a radar, but has some issues:

  • It is only displayed at the year level of a radar
  • It assumes a fixed 96 files per day (i.e. once every 15 minutes). Many radars have a different interval, making the color a bit meaningless
  • The visualization is currently cut off at the top (looks even worse on the new layout I am working on)
  • It relies on a coverage.csv file in the repository, which has to be maintained for that purpose
  • It makes use of pretty outdated JS code
  • @niconoe We will have to update the code anyway once we switch to the bucket with the new directory structure

@baptischmi @CeciliaNilsson709 @bart1 ... do you often use this coverage chart?

  • If so, just to check gaps in the data (more a TRUE/FALSE per day, rather than the number of files per day)?
  • Are there representations that would be better? E.g. showing the number of files for the current directory?
bart1 commented

I do find it a good feature, and it does help with selecting data. Most of what I look at is TRUE/FALSE; however, sometimes I also look at changes in the frequency of the data (change in color over time) or at its regularity (is there a lot of variation between days?). If you are thinking about expanding it, for me it would be useful at the higher levels (e.g. radar or country); at the lower levels (month/week) I think I would use it less.

Thanks @bart1. I'm thinking to include a separate coverage page, where radar coverage can be compared across radars. E.g. time on the x axis, radars on the y axis, blocks to indicate if files were found (per month). The visualization could have anchors, so the user can jump to his/her radar of choice from the browse functionality. See suggestion in enram/aloftdata.eu#11.
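To make the idea concrete, here is a minimal sketch of such a radar-by-month grid, rendered as plain text. The function name `presence_grid`, the input shape (radar code mapped to a set of `(year, month)` tuples) and the second radar code are assumptions for illustration only, not existing code in this repository.

```python
def presence_grid(months_by_radar, start, end):
    """Build one text row per radar: '#' if files were found that month, '.' if not.

    months_by_radar: dict mapping a radar code to a set of (year, month) tuples.
    start, end: (year, month) tuples, both inclusive.
    """
    # Enumerate every month on the x axis, from start to end.
    months = []
    year, month = start
    while (year, month) <= end:
        months.append((year, month))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)

    # One row per radar (the y axis), sorted by radar code.
    grid = {
        radar: "".join("#" if ym in months_by_radar[radar] else "." for ym in months)
        for radar in sorted(months_by_radar)
    }
    return months, grid


# Example with made-up months; "xxaaa" is a hypothetical radar code.
months, grid = presence_grid(
    {"bejab": {(2018, 5), (2018, 7)}, "xxaaa": {(2018, 6)}},
    start=(2018, 5),
    end=(2018, 7),
)
for radar, row in grid.items():
    print(radar, row)
```

Sorting rows by radar code is just the simplest choice here; any other ordering would only change the `sorted(...)` call.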

I agree with Bart that it is quite useful to be able to see directly while browsing what kind of file density there is (especially to see gaps and changes). And lower levels wouldn't really make sense since you do see days, and that's definitely enough for this kind of quick overview. It's a good tool for deciding what to download without having to look through everything. Having to go somewhere else to look it up seems worse, although having such an overview would be nice in other situations.

But that being said, it's not critical to keep if it's hard to maintain.

bart1 commented

@peterdesmet comparing across radars is definitely very useful as it makes it possible to find time periods where data for the analysis of interest are available. It sounds like a great feature. It would be great to sort radars or countries by spatial proximity.

A late response from my side. TRUE/FALSE information is most useful, indeed. As mentioned above, a multi-year overview by country, per month (rather than per week), can be useful. Differing time resolutions could be handled by using two different colour scales (blue tones for 15 min, red for 5 min), or simply by adding meta information on the median time resolution between hdf5 files?
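As a rough sketch of the "median time resolution" idea: given the timestamps of a radar's files, the median gap between consecutive files gives its typical interval, and from that the number of files to expect per day (instead of a fixed 96). Both function names are hypothetical, and how the timestamps are obtained (e.g. from file names) is left open.

```python
from datetime import datetime
from statistics import median


def median_interval_minutes(timestamps):
    """Median gap, in minutes, between consecutive file timestamps."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        raise ValueError("need at least two timestamps to compute an interval")
    gaps = [(b - a).total_seconds() / 60 for a, b in zip(ordered, ordered[1:])]
    return median(gaps)


def expected_files_per_day(interval_minutes):
    """Number of files in 24 hours if one file arrives every interval_minutes."""
    return round(24 * 60 / interval_minutes)


# Made-up timestamps: a 5-minute radar with one 30-minute gap.
stamps = [
    datetime(2018, 5, 18, 0, 0),
    datetime(2018, 5, 18, 0, 5),
    datetime(2018, 5, 18, 0, 10),
    datetime(2018, 5, 18, 0, 15),
    datetime(2018, 5, 18, 0, 45),
]
interval = median_interval_minutes(stamps)
print(interval, expected_files_per_day(interval))
```

The median is used rather than the mean so that a few long gaps do not distort the estimated interval.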

Thanks for the feedback! The coverage is not visualized on the website (as part of the browse functionality) at the moment, but the data repository includes a simpler coverage.csv that lists the number of files per directory:

baltrad/daily/bejab/2023,17
baltrad/hdf5/bejab/2018/05/18,47
baltrad/hdf5/bejab/2018/05/22,167
baltrad/hdf5/bejab/2018/05/23,229
baltrad/hdf5/bejab/2018/05/24,227
baltrad/hdf5/bejab/2018/05/25,215
baltrad/hdf5/bejab/2018/05/26,192
...

That should allow you to make your own coverage. I'll create an issue in the bioRad repository suggesting a function for this.
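In the meantime, a minimal Python sketch of how the lines above could be turned into a per-day coverage check. It assumes the path layout seen in the excerpt (`source/hdf5/radar/yyyy/mm/dd,count`) and treats a day that is absent from the file as a gap; the function names are hypothetical and rows at other directory levels are simply skipped.

```python
import csv
from collections import defaultdict
from datetime import date, timedelta


def parse_daily_counts(lines):
    """Return {radar: {date: file_count}} for day-level hdf5 directories."""
    counts = defaultdict(dict)
    for row in csv.reader(lines):
        if len(row) != 2:
            continue  # skips blank lines and anything that is not "path,count"
        parts = row[0].strip().split("/")
        if len(parts) != 6 or parts[1] != "hdf5":
            continue  # only source/hdf5/radar/yyyy/mm/dd rows
        try:
            day = date(int(parts[3]), int(parts[4]), int(parts[5]))
            counts[parts[2]][day] = int(row[1])
        except ValueError:
            continue  # not a valid date or count
    return dict(counts)


def missing_days(day_counts):
    """Days between the first and last listed day that have no entry."""
    first, last = min(day_counts), max(day_counts)
    span = (last - first).days
    all_days = (first + timedelta(days=offset) for offset in range(span + 1))
    return [day for day in all_days if day not in day_counts]


sample = [
    "baltrad/daily/bejab/2023,17",
    "baltrad/hdf5/bejab/2018/05/18,47",
    "baltrad/hdf5/bejab/2018/05/22,167",
    "baltrad/hdf5/bejab/2018/05/23,229",
]
counts = parse_daily_counts(sample)
print(missing_days(counts["bejab"]))
```

With the real file, `open("coverage.csv", newline="")` can be passed in place of `sample`.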