Add pooling options to Q2 workflows

Question

Closed this issue a year ago · 8 comments

Improvement Description
Add a new option that allows users to pick independent sample processing (as done currently), pooled sample processing, or "pseudo-pooling" that was added in 1.7.5. It probably makes sense to wait until the R package 1.8 release is available (~June) to add this.

The pooling options provide better detection of rare per-sample variants at the cost of increased computation time.

Also consider making pseudo-pooling the default processing mode.

References
"pseudo-pooling" that was added in 1.7.5

Answer 1 · 2019-02-13T16:18:24.000Z

forum xref

Answer 2 · 2019-03-15T23:38:34.000Z

Question: Can default parameter choices be dependent on other parameter choices?

The reason I ask: Pooled chimera removal is better if pooled sample inference is performed, but the default chimera removal is consensus, which is better for the default sample inference method (independent). So, can chimera removal be defaulted to pooled if the user selects pooled sample inference?

Answer 3 · 2019-03-18T20:14:48.000Z

Not really. There would be a way to refine the types based on other types passed (should be available next week-ish), but that would categorically prevent mixing the two.

A different approach would be to have the two steps be separate actions, and then in a pipeline which composes them, you have a "simpler" argument which unifies the two arguments. That way, the "default" invocation does the ideal thing for inference and chimera checking, but mixing them is still possible if you run the sub-actions directly.
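A minimal sketch of that composition idea in plain Python, not the QIIME 2 plugin API. All names here (`infer_samples`, `remove_chimeras`, `denoise`, `mode`) are hypothetical, and the sub-actions are stand-ins that only record the settings they were given. The pipeline exposes one unified argument, while the sub-actions still accept any combination when called directly.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-ins for the two separate actions; they only record
# the settings they receive instead of running any real denoising.
def infer_samples(samples: List[str], pooling: str) -> Dict[str, object]:
    return {"samples": list(samples), "pooling": pooling}

def remove_chimeras(inferred: Dict[str, object], method: str) -> Dict[str, object]:
    return {**inferred, "chimera_method": method}

# One unified "mode" maps to a matched (pooling, chimera method) pair,
# following the pairing described earlier in this thread.
_MODES: Dict[str, Tuple[str, str]] = {
    "independent": ("independent", "consensus"),
    "pooled": ("pooled", "pooled"),
}

def denoise(samples: List[str], mode: str = "independent") -> Dict[str, object]:
    """Pipeline: a single argument configures both sub-actions consistently."""
    if mode not in _MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    pooling, chimera = _MODES[mode]
    return remove_chimeras(infer_samples(samples, pooling), chimera)

# Default invocation gets the matched pair.
result = denoise(["s1", "s2"], mode="pooled")
print(result["pooling"], result["chimera_method"])  # pooled pooled

# Mixing is still possible by calling the sub-actions directly.
mixed = remove_chimeras(infer_samples(["s1"], "pooled"), "consensus")
print(mixed["pooling"], mixed["chimera_method"])  # pooled consensus
```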

Answer 4 · 2019-03-18T21:14:54.000Z

My current idea is to change the default chimera method to "auto", which chooses "consensus" or "pooled" chimera removal depending on the choice made at the sample inference step. Users will still be able to define the chimera removal method themselves in which case that choice will be used.

That achieves my goal here of defaulting to the "right" chimera removal method for each sample inference method, but let me know if that seems a bad idea.
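Roughly what I have in mind, as a plain-Python sketch rather than the actual plugin code. The function and parameter names (`resolve_chimera_method`, `pooling_method`, `chimera_method`) are placeholders I made up for illustration. Only the two inference modes discussed above are mapped; any other mode, such as pseudo-pooling, would need its own entry once the right pairing is decided.

```python
# Hypothetical names throughout; this is not the q2-dada2 API.
_AUTO_CHIMERA = {
    "independent": "consensus",  # default inference pairs with consensus removal
    "pooled": "pooled",          # pooled inference pairs with pooled removal
}

def resolve_chimera_method(pooling_method: str, chimera_method: str = "auto") -> str:
    """Return the chimera removal method to use.

    An explicit user choice always wins; "auto" defers to the choice
    made at the sample inference step.
    """
    if chimera_method != "auto":
        return chimera_method
    try:
        return _AUTO_CHIMERA[pooling_method]
    except KeyError:
        raise ValueError(
            f"no automatic chimera method defined for {pooling_method!r}"
        ) from None

print(resolve_chimera_method("independent"))          # consensus
print(resolve_chimera_method("pooled"))               # pooled
print(resolve_chimera_method("pooled", "consensus"))  # consensus (user override)
```

Since "auto" is resolved in one place, users who set the chimera method themselves are unaffected.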

Answer 5 · 2019-03-18T22:08:40.000Z

That works too! There are a few places where we have similar patterns.

Answer 6 · 2019-07-17T12:28:45.000Z

Shoot, I didn't have this Q2 release on my calendar and it looks like PRs are due July 22. It would be really nice to get pseudo-pooling in though, as I know there are a decent number of people interested in that feature. I'll see if I can squeeze some time in, but I can't promise anything.

…

On Wed, Jul 17, 2019 at 6:33 AM yanxianl wrote:
Hi, will pseudo-pooling and pooling make it to the coming release of QIIME2-2019.7? I'm looking forward to using pseudo-pooling for my dataset within QIIME2, which otherwise has to be done in R instead.

Answer 7 · 2019-07-17T14:22:24.000Z

Thanks @benjjneb. We can try to coordinate efforts, too: if you want to pass things off in a semi-usable state, one of us can probably run it across the finish line.

Answer 8 · 2019-11-26T22:51:20.000Z

The R code for pseudo-pooling in the Q2 plugin is working on my end in #122