The Connectivity Modeling System (CMS) was developed to study complex larval migrations and give probability estimates of population connectivity. The CMS can also track virtual particles that are passively advected by the velocity fields.
Since 2014, I have been using CMS to quantify Agulhas leakage by seeding particles in the Agulhas Current jet and tracking how many of them end up on the other side of the GoodHope line, along with the timing of those crossings. By summing the crossing particles at each time step, we create a time series of Agulhas leakage. More details can be found in my recent paper.
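The summing step can be sketched as follows. This is only a minimal illustration, not the code used in the paper: the function name, the use of -1 to mark particles that never cross, and the optional per-particle volume weights are all assumptions made for the example.

```python
import numpy as np

def crossing_time_series(crossing_steps, n_steps, volume=None):
    """Sum crossing particles (or their volume) at each time step.

    crossing_steps: time-step index at which each particle crosses the line;
                    -1 (an assumed convention) marks a particle that never crosses.
    n_steps:        number of time steps in the output series.
    volume:         optional per-particle weights, e.g. a volume tag.
    """
    crossing_steps = np.asarray(crossing_steps, dtype=int)
    crossed = crossing_steps >= 0
    weights = None
    if volume is not None:
        weights = np.asarray(volume, dtype=float)[crossed]
    # bincount adds 1 (or the weight) to the bin of each crossing step
    return np.bincount(crossing_steps[crossed], weights=weights, minlength=n_steps)

# Hypothetical data: five particles, one of which never crosses.
steps = [2, 0, -1, 2, 3]
counts = crossing_time_series(steps, n_steps=5)
weighted = crossing_time_series(steps, n_steps=5, volume=[1.0, 2.0, 3.0, 4.0, 5.0])
```

Passing the volume weights turns the particle count into a transport-like series, which is the role the volume_tag files would plausibly play here.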
As the coupled model keeps generating more output, running CMS as before becomes less practical. The CMS was not designed to track particles at such a scale over such a long period (multiple decades). I also ran into issues when submitting CMS jobs to the UM cluster: a single continuous job cannot complete within the wall time and memory limits. So I came up with a workaround that divides the multi-decade job into several smaller chunks, so that each job can complete successfully. Moreover, this lets me extend the leakage time series easily without rerunning the previous years.
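The chunking idea can be sketched like this. It is a hedged illustration only: the function name and the example years are hypothetical, and the real release-file generation in this repo involves more than computing year ranges.

```python
def split_into_chunks(start_year, end_year, chunk_years=5):
    """Return (first_year, last_year) pairs covering start_year..end_year inclusive."""
    if chunk_years < 1:
        raise ValueError("chunk_years must be at least 1")
    chunks = []
    first = start_year
    while first <= end_year:
        # The final chunk may be shorter than chunk_years.
        last = min(first + chunk_years - 1, end_year)
        chunks.append((first, last))
        first = last + 1
    return chunks

# Hypothetical model years, split into five-year chunks.
chunks = split_into_chunks(1980, 1992)
# Extending the series later only needs the new years, not the old ones.
extension = split_into_chunks(1993, 1997)
```

Each pair would then correspond to one job submission, which keeps every job inside the wall time and memory limits.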
One day, folks from the Center for Computational Science (CCS) told me that I was suspended from submitting more CMS jobs because such jobs drained the system memory and significantly dragged down the performance of the cluster. Some staff members helped me test-run CMS on an isolated filesystem, and we eventually identified that most of the memory usage was caused by writing the output as NetCDF files. So they advised all CMS users on the cluster to set the output format to ASCII.
Changing the output to ASCII reduces the time required for a 5-year chunk tracking 600 thousand particles from 12 hours to less than 30 minutes. However, it also renders my old post-processing scripts useless. This repo documents some of the changes I made.
gen_hrc07_release_chunks.py and multiple_gen.py are used to generate release files and their corresponding volume_tag files. multiple_gen.py calls gen_hrc07_release_chunks.py as a function to generate release files in five-year chunks.
changeending.py can add .txt to the traj_file_xx files in the expt_name/output folder. Alternatively, one can modify the source code of CMS by adding //".txt" to output.f90, lines 55-57. This change allows the MATLAB function tabularTextDatastore (available after 2016a) to detect these ASCII files.
traj_proc_update.m is the main program, calling two functions, cms_ascii_postproc and dailyload_core_voltag.
proc_sub is a sample LSF job submit script.
jul2greg.m is used in cms_ascii_postproc to change the original release date to Gregorian days (obtained from the internet).
chunk_traj_proc.py copies, renames and edits traj_proc_update.m and proc_sub for several chunks at once.
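For readers who want to see what the renaming step amounts to, here is a minimal sketch. It is not the contents of changeending.py; the function name, the traj_file_ prefix default, and the choice to skip files that already have an extension are assumptions made for illustration.

```python
from pathlib import Path

def add_txt_suffix(output_dir, prefix="traj_file_"):
    """Append .txt to extension-less trajectory files in output_dir.

    Returns the list of new paths. Files that already carry an extension
    are left alone, so rerunning the function is harmless.
    """
    renamed = []
    # sorted() collects the directory listing before any file is renamed
    for path in sorted(Path(output_dir).iterdir()):
        if path.is_file() and path.name.startswith(prefix) and path.suffix == "":
            target = path.with_name(path.name + ".txt")
            path.rename(target)
            renamed.append(target)
    return renamed

# Example call (path is a placeholder for the experiment's output folder):
# add_txt_suffix("expt_name/output")
```

Renaming after the run avoids touching the CMS source, at the cost of one extra step per chunk.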