ICB-DCM/PESTO

improvement of sampling pipeline

JanHasenauer opened this issue · 4 comments

Sampling currently requires quite a bit of manual work, such as shortening of chains. I think it would be great to have a routine that performs multiple sampling runs, automatically removes the burn-in, and checks for reproducibility. This could be implemented in getParameterSamples.

Good idea. A first step would be to include an analysis function that takes a folder containing sampling results as input and returns summary statistics in a struct-like object.

However, at this point I am not a fan of adding multiple runs as an option for getParameterSamples, since this can be done very easily with multiple calls of getParameterSamples and I see hardly any setting where the option would be useful. Running multiple sampling runs in parallel on a typical laptop is usually not feasible, unlike for optimization. On a server or grid-like infrastructure, on the other hand, the runs usually have to be cleanly split so that the queueing system can distribute the jobs to separate nodes. Thus, the option could not be used in that scenario either.

I agree that for large problems, one usually needs more computational resources. For smaller problems, however, this is not the case. One application example would be Elba's model for hematopoiesis. I think it would be really nice to have a function getMultipleParameterSamples (similar to what we have for optimisation), which also performs the analysis. Even if people cannot use the function right away in a server setting, they would be able to see which steps have to be taken. For many users this is not clear.

What do you think about an in-depth example covering such functionality by calling multiple instances of getParameterSamples(), followed by the application of the analysis pipeline? Such a piece of code could be used by everyone interested in robust sampling assessment. With a function such as getMultipleParameterSamples, I currently see the danger of misleading inexperienced users.
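To make the steps discussed here concrete (several independent runs, burn-in removal, reproducibility check), a minimal sketch could look like the following. It is written in plain Python rather than MATLAB and does not use any PESTO function: `run_chain` is a toy random-walk Metropolis sampler standing in for a single sampling run, the burn-in is assumed to be a fixed fraction of each chain, and reproducibility is assessed with the Gelman-Rubin statistic (R-hat). All function names and parameter values are hypothetical.

```python
import math
import random
import statistics


def standard_normal_log_post(x):
    """Toy target: log-density of a standard normal, up to a constant."""
    return -0.5 * x * x


def run_chain(log_post, x0, n_steps, step, seed):
    """Toy 1-D random-walk Metropolis sampler (stand-in for one sampling run)."""
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)
    chain = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        lp_proposal = log_post(proposal)
        # Accept with probability min(1, exp(lp_proposal - lp));
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < lp_proposal - lp:
            x, lp = proposal, lp_proposal
        chain.append(x)
    return chain


def remove_burn_in(chain, fraction=0.5):
    """Discard the first `fraction` of the chain (assumed fixed-fraction rule)."""
    return chain[int(len(chain) * fraction):]


def gelman_rubin(chains):
    """Gelman-Rubin statistic (R-hat) for at least two equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


# Four independent runs from dispersed starting points.
starts = [-10.0, -3.0, 3.0, 10.0]
chains = [
    remove_burn_in(run_chain(standard_normal_log_post, x0, 5000, 1.0, seed))
    for seed, x0 in enumerate(starts)
]
r_hat = gelman_rubin(chains)
print("R-hat:", r_hat)
print("runs agree" if r_hat < 1.1 else "runs disagree, inspect the chains")
```

Values of R-hat close to 1 indicate that the runs agree after burn-in removal; the threshold of 1.1 is a common rule of thumb, not a guarantee of convergence. Because each run is an independent call with its own seed, the same structure maps onto separate cluster jobs followed by a single analysis step.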

An example is good, but in my opinion not sufficient. One never knows whether a user looks at all the examples.

What danger do you see regarding: "With a function such as getMultipleParameterSamples, I currently see the danger of misleading inexperienced users."?

Clearly, no method implemented in PESTO will always work. I wouldn't see this as a problem.