desihub/nightwatch

clean up redundant nightwatch files at NERSC

sybenzvi opened this issue · 3 comments

Nightwatch files at NERSC are written to $CFS/desi/spectro/nightwatch/nersc, but there are some redundant files in the subfolder $CFS/desi/spectro/nightwatch/nersc/perlmutter from a period last spring when @marcelo-alvarez was testing new scron jobs. The subfolder contains 71 TB of files, some of which are symlinked from the top-level folder and some of which are not. Time to do some fall cleaning.

@sybenzvi wow that's a lot of redundancy - it was surprising to learn that nightwatch stores nearly 1 TB a night.

I suggest you remove all YYYYMMDD-named directories in

$CFS/desi/spectro/nightwatch/nersc/perlmutter

except for

20221221
20221222
20221223
20221224
20221225
20221226
20221227
20221228
20221229
20221230
20221231
20230101
20230102
20230103
20230104

since these are the linked (non-redundant) directories. Alternatively, you could move those directories up, although it might be nice to retain the information about the provenance of the contained data via the symbolic link.
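As a dry-run check before deleting anything, the split between linked and redundant nights can be sketched in Python. This is only a sketch, assuming the top-level folder holds symbolic links whose targets are the YYYYMMDD directories under perlmutter; the function name `find_redundant_nights` is hypothetical and not part of nightwatch.

```python
import re
from pathlib import Path

# Night directories are assumed to be named YYYYMMDD (eight digits).
NIGHT_RE = re.compile(r"^\d{8}$")


def find_redundant_nights(top, sub):
    """Return (linked, redundant) lists of night directory names in `sub`.

    A night directory in `sub` counts as linked if some symlink directly
    inside `top` resolves to it; every other YYYYMMDD directory is reported
    as a removal candidate. Nothing is deleted here.
    """
    top, sub = Path(top), Path(sub)
    # Resolve every symlink in the top-level folder to its real target.
    targets = {p.resolve() for p in top.iterdir() if p.is_symlink()}

    linked, redundant = [], []
    for d in sorted(sub.iterdir()):
        # Only consider real (non-symlink) directories named like a night.
        if d.is_symlink() or not d.is_dir() or not NIGHT_RE.match(d.name):
            continue
        if d.resolve() in targets:
            linked.append(d.name)
        else:
            redundant.append(d.name)
    return linked, redundant
```

Calling it with the top-level nightwatch folder and its perlmutter subfolder (with $CFS expanded, for example via `os.path.expandvars`) and comparing the `linked` list against the fifteen nights above would confirm the list before any removal.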

If this sounds good to you, please go ahead, thanks.

> @sybenzvi wow that's a lot of redundancy - it was surprising to learn that nightwatch stores nearly 1 TB a night.

I agree! I thought I had cleaned things up, but we probably kept running the two systems in parallel until shortly before Cori went offline, and then I never went back to catch straggling files. I'll clean the folder and close the ticket.

Symbolic links to files in $CFS/desi/spectro/nightwatch/nersc/perlmutter were copied to the parent directory and the perlmutter folder was deleted. Will set up some additional cleanup of intermediate files but meanwhile we can close this ticket.