insightsengineering/random.cdisc.data

Install async update of cache

Closed this issue · 4 comments

This should be relatively easy to do. It would avoid inconsequential changes to data whose source files were not touched. We should also consider whether we should, or can, keep the timestamp async as well. What do you think @edelarua @shajoezhu?

Hi @Melkiades, I just saw this issue. Can you provide a bit more context?


As it is now, every time we update the cache we rerun everything, regardless of whether the source files have changed. I would add a check for a diff in the source files and then rebuild the cache selectively, only for the datasets whose sources changed. I would use something like git diff --name-only
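To make the idea concrete, here is a minimal sketch of that check, written in Python purely for illustration. The names `changed_source_files` and `datasets_to_refresh`, the default `base_ref`, the `source_dir` value, and the convention that a file's stem identifies the dataset to refresh are all assumptions, not part of the package.

```python
import subprocess
from pathlib import PurePosixPath


def changed_source_files(base_ref="HEAD~1", source_dir="R"):
    """List files under source_dir that differ from base_ref.

    base_ref and source_dir are hypothetical defaults; adjust them
    to wherever the cache was last built and where the sources live.
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", base_ref, "--", source_dir],
        check=True,
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def datasets_to_refresh(changed_files):
    """Map changed .R files to cache entries by file stem.

    Assumed convention: 'R/radsl.R' corresponds to the entry 'radsl'.
    Non-R files are ignored.
    """
    return sorted(
        {PurePosixPath(path).stem for path in changed_files if path.endswith(".R")}
    )
```

Calling `datasets_to_refresh(changed_source_files())` inside the repository would then give the list of entries to rebuild, leaving the rest of the cache untouched.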

@Melkiades one thing to note - some datasets affect others (e.g. a change in radsl.R will cause changes in most/all datasets), which will have to be taken into account when determining which datasets to update.

Exactly what Emily said. I think we also need something to capture the dataset dependencies.
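One possible way to capture this, sketched in Python for illustration only: keep a map from each dataset to the datasets built from it, and expand the set of changed datasets to everything downstream. The function name `datasets_to_update` and the example map are hypothetical; the real dependency structure would have to come from the package itself.

```python
from collections import deque


def datasets_to_update(changed, dependents):
    """Return the changed datasets plus everything derived from them.

    dependents maps a dataset name to the datasets built from it.
    The traversal is a breadth-first walk over that map.
    """
    to_update = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, ()):
            if child not in to_update:
                to_update.add(child)
                queue.append(child)
    return to_update


# Hypothetical dependency map, for illustration only.
example_dependents = {
    "adsl": ["adae", "adlb"],
    "adae": ["adaette"],
}
```

With this map, a change to `adsl` would mark all four datasets for update, while a change to `adlb` would mark only `adlb`.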