Need optimisation for chunk dimensions
texadactyl opened this issue · 4 comments
The current chunk strategy is not working optimally for turbo_seti.
Initially, rawspec v3 used a uniform chunk shape of (1, 1, number_of_fine_channels).
This degrades performance in readers such as turbo_seti.
This was amended to (Nds, 1, Nfpc),
where Nds is the number of spectra per dump and Nfpc is the number of fine channels per coarse channel.
@david-macmahon, from the SETI BL Slack:
Change the chunk size logic such that
- The frequency dimension = min(nfpc, 65536)
- The time dimension is max(Nds, 16)
The only concern is that for nfpc=2^20 and Nc=64, we will have 1024 "active" chunks for the high frequency resolution product. That might not play well with the chunk cache, so performance testing this idea would be critical. I think it would make the read side (e.g. turbo_seti) very happy.
Agreed. Much better, IMO. Getting started on work towards tag v3.1.2. The only affected module is fbh5_open.c.
cc: @lacker @mattlebofsky
Current fbh5_open.c state:
Control chunking with
int USE_BLIMPY = 0; // 1 : use blimpy's algorithm; 0 : don't do that
Control caching with
int CACHING_TYPE = 1; // 0 : no caching ; 1 : computed caching specifications; 2 : default caching
chunking logic in fbh5_open.c:
cdims[0] = Nd; // number of spectra per dump
cdims[1] = 1;
cdims[2] = p_fb_hdr->nfpc; // number of fine channels per coarse channel
computed caching in force:
fcache_nslots = (cdims[0] * cdims[2]) + 1;
fcache_nbytes = (Nd * p_fbh5_ctx->tint_size) + 1; // tint_size: byte size of one spectrum