TheEconomist/covid-19-the-economist-global-excess-deaths-model

prediction matrix not saved in output-data folder

Closed this issue · 5 comments

I am trying to run the three excess deaths model scripts. The third script throws an error when it subsets the prediction matrix using the export_covariates object. I gather the line is meant to remove the Mumbai region from the prediction matrix, but export_covariates and pred_matrix have mismatched dimensions, so the subsetting fails.

I have tried cleaning the environment and rerunning the second script but it continues to return the same error.

pred_matrix <- pred_matrix[export_covariates$iso3c != "IND_Mumbai", ]

Error in pred_matrix[export_covariates$iso3c != "IND_Mumbai", ] :
(subscript) logical subscript too long

Thanks

Hi, are the two objects of different dimensionality when you first load them?

I would suggest restarting R and running the export script up to line 24. Let me know if you get an error.

Note that if you have generated a new pred_matrix yourself you must ensure that you generate a corresponding export_covariates object as well.
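
For anyone hitting the same error, here is a minimal sketch of how a stale pred_matrix paired with a newer export_covariates produces it, plus a simple row-count check that could be run before subsetting. The toy objects, their sizes, and the `rows_match` helper are assumptions for illustration only; they are not taken from the repo's scripts.

```r
# Toy stand-ins for the real objects (hypothetical sizes, not the repo's data)
export_covariates <- data.frame(
  iso3c = c("IND", "IND_Mumbai", "USA", "GBR"),
  stringsAsFactors = FALSE
)
stale_pred_matrix <- matrix(0, nrow = 3, ncol = 2)  # one row short, as if from an older run
fresh_pred_matrix <- matrix(0, nrow = 4, ncol = 2)  # regenerated alongside export_covariates

mask <- export_covariates$iso3c != "IND_Mumbai"

# A length-4 logical index on a 3-row matrix fails; capture the message instead of stopping
res <- tryCatch(
  stale_pred_matrix[mask, ],
  error = function(e) conditionMessage(e)
)
print(res)

# Hypothetical guard: TRUE only when both objects describe the same number of rows
rows_match <- function(pred_matrix, export_covariates) {
  nrow(pred_matrix) == nrow(export_covariates)
}

print(rows_match(stale_pred_matrix, export_covariates))
print(rows_match(fresh_pred_matrix, export_covariates))

# With matching objects, the subsetting line works and drops the Mumbai row
kept <- fresh_pred_matrix[mask, ]
print(dim(kept))
```

Running a check like `rows_match` right after loading both files would surface the mismatch before the subsetting line is reached.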

Hi - I think I figured out the issue. When I generated a new pred_matrix, it was saved in the working directory rather than the /output-data/ directory. This created a mismatch between my new export_covariates object in the output-data directory and the old pred_matrix that I had cloned from the repo.

The problem is in the model script at line 321:
saveRDS(pred_matrix, "pred_matrix.RDS")

While the third (export) script at lines 16 to 20 reads:

# Load all model prediction + 101 bootstrap
pred_matrix <- readRDS("output-data/pred_matrix.RDS")

# Load covariates (iso3c, country name, population ++)
export_covariates <- readRDS("output-data/export_covariates.RDS")

When I moved the new pred_matrix.RDS to the output-data directory, it seemed to address the issue. I am running through all three scripts again just to be certain, though they take a significant amount of time to run on my laptop.

Hi again - sorry for the multiple messages. I ran through the scripts again with line 321 modified to save the prediction matrix in the same folder as the other model output data, and that seems to address the issue with the third script.

I would recommend updating line 321 of the model script to the following, so that others trying to replicate your analysis with updated data do not run into the same issue.

saveRDS(pred_matrix, "output-data/pred_matrix.RDS")
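
As a rough sketch of the same idea, the save path can be built explicitly and then read back to confirm the file landed where the export script expects it. The temporary directory and the toy matrix below are assumptions for illustration; in the repo the target would be the existing output-data folder.

```r
# Stand-in for the repo's output-data folder (a temp location, so nothing is overwritten)
out_dir <- file.path(tempdir(), "output-data")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Toy prediction matrix (hypothetical; the real one comes from the model script)
pred_matrix <- matrix(seq_len(6), nrow = 3)

# Save next to the other model outputs, then read back from the same path
out_path <- file.path(out_dir, "pred_matrix.RDS")
saveRDS(pred_matrix, out_path)
reloaded <- readRDS(out_path)

# The reloaded object should be the one just written
print(identical(reloaded, pred_matrix))
```

Writing and reading through one shared path variable keeps the two scripts from drifting apart on where the file lives.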

Thank you for the opportunity to have a closer look at some really cool work.

Nick

Thanks Nick, that is a good idea. I have made the suggested change, which should ease model regenerations. Thanks for the suggestions.