submitted and other changes on master for multiple PR processing
Closed this issue · 6 comments
With the current CI running on limited computational resources, staggered processes may deploy to the same catalog.txt and processed/README.md from a CI clone that is behind the master branch. This may be remedied by re-pulling at the deployment step.
Separately, keeping in-process files in the submitted directory is causing errors: if one PR is being processed and a new PR clones the directory after CI initialization but before the first PR is merged, the submitted yaml of the first PR will still be pulled down with the second submission. Improved resource allocation on the CI servers will likely mitigate this by reducing processing time and, with it, the likelihood of staggered processes. However, it is a bug that needs to be addressed.
The slow processing is the result of the relatively restrictive chunksize used to fit the CI environment. Local processing with cimr (and Python in general) can allocate memory dynamically, which allows larger chunksizes and faster processing without fatal memory errors.
One easy solution is to separate submitted, processing and processed directories.
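As a rough sketch of what that separation could look like (not how cimr-d currently works): each job first moves the yaml files it is about to handle out of submitted into a processing directory, so a job that starts later no longer sees them. The function names, the processing directory name, and the .yml/.yaml patterns are assumptions made for illustration. In CI the move would also have to be committed and pushed before other clones could see it.

```python
import os
import tempfile
from pathlib import Path


def claim_submissions(root):
    """Move YAML files from submitted/ to processing/ and return the new paths."""
    root = Path(root)
    submitted = root / "submitted"
    processing = root / "processing"  # hypothetical directory name
    processing.mkdir(parents=True, exist_ok=True)

    candidates = sorted(submitted.glob("*.yml")) + sorted(submitted.glob("*.yaml"))
    claimed = []
    for src in candidates:
        dst = processing / src.name
        try:
            # os.replace is a single rename when both paths share a filesystem
            os.replace(src, dst)
        except FileNotFoundError:
            # another job claimed this file first; skip it
            continue
        claimed.append(dst)
    return claimed


def mark_processed(root, claimed):
    """Move files that finished processing from processing/ to processed/."""
    processed = Path(root) / "processed"
    processed.mkdir(parents=True, exist_ok=True)
    done = []
    for src in claimed:
        dst = processed / src.name
        os.replace(src, dst)
        done.append(dst)
    return done


if __name__ == "__main__":
    # Small demonstration in a throwaway directory
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "submitted").mkdir()
        (Path(tmp) / "submitted" / "example.yml").write_text("key: value\n")
        files = claim_submissions(tmp)
        print("claimed:", [p.name for p in files])
        print("left for a later job:", [p.name for p in claim_submissions(tmp)])
        print("processed:", [p.name for p in mark_processed(tmp, files)])
```

With this layout a second job that lists submitted after the first job has claimed its files gets an empty list instead of the earlier PR's yaml.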
It might be good to add a git pull around here:
https://github.com/greenelab/cimr-d/blob/master/.circleci/deploy.sh#L95
That way the opportunity for a race condition is dramatically reduced, though not eliminated.
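A minimal sketch of that idea, written in Python rather than as the actual contents of deploy.sh: pull right before pushing, and if the push is still rejected because another job pushed in between, pull and try again. The function name, the retry count, the remote name origin, and the injectable run parameter are assumptions made for illustration.

```python
import subprocess


def sync_and_push(repo_dir, branch="master", attempts=3, run=subprocess.run):
    """Pull with rebase, then push; retry if the push is rejected.

    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, attempts + 1):
        # Bring the CI clone up to date immediately before deploying
        run(["git", "pull", "--rebase", "origin", branch], cwd=repo_dir, check=True)

        # A non-zero exit code here usually means another job pushed first
        result = run(["git", "push", "origin", branch], cwd=repo_dir, check=False)
        if result.returncode == 0:
            return attempt

    raise RuntimeError(f"push to {branch} still rejected after {attempts} attempts")
```

The window between the pull and the push is short but not zero, so the retry only narrows the race further. Two jobs that edit the same lines of catalog.txt can still end in a rebase conflict, which this sketch surfaces as an exception rather than resolving.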
yes, same thoughts
That seems to be a good idea. Let me do some tests first.