Uncertainty of estimated proportions
kvittingseerup opened this issue · 3 comments
kvittingseerup commented
I was just wondering if it is possible to use the posterior distribution to estimate a confidence/credible interval on the estimated cell type proportions?
Cheers
Kristoffer
tinyi commented
Dear Kristoffer,
Sorry for the delay.
This is a great question. A confidence interval does not apply here, since
it is a frequentist concept. A credible interval is possible in principle,
but the current version does not output the full posterior samples. When
developing the package I considered outputting the full posterior and
decided against it, for the following reasons.
1) To save memory, so that the memory overhead does not grow with the
length of the MCMC chain.
2) The posterior is often sharply peaked because of the high sequencing
depth (>=1E8) of a typical RNA-seq dataset.
3) Very few applications would need the full posterior. The two I can
imagine are (a) testing whether the fraction of some cell type is > 0, and
(b) testing whether the fraction of cell type A is greater in sample B than
in sample C. Both, however, should be tested on biological replicates
rather than on MCMC samples, if one thinks carefully about the actual null
hypothesis being tested. Besides, under the Dirichlet distribution all
fractions are strictly positive by construction, so a test of the
"cell type > 0" hypothesis is not meaningful.
Let me know if this answers your question. Also, feel free to share your
thoughts on applications of the full posterior. We may consider adding this
feature in a future release.
Best,
Tinyi
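To make points 2) and 3) concrete, here is a minimal sketch in plain Python. It is not the package's code or API: the fractions, the depths, and the way the Dirichlet concentration is built from them are all assumptions chosen for illustration. It draws samples from a Dirichlet "posterior", takes an equal-tailed 95% credible interval for a rare cell type, and shows that the interval narrows sharply with depth while its lower bound stays above zero.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def credible_interval(samples, level=0.95):
    """Equal-tailed credible interval from scalar posterior samples."""
    ordered = sorted(samples)
    n = len(ordered)
    lo_idx = int((1 - level) / 2 * (n - 1))
    hi_idx = int((1 + level) / 2 * (n - 1))
    return ordered[lo_idx], ordered[hi_idx]


rng = random.Random(0)
true_fractions = [0.70, 0.29, 0.01]  # hypothetical cell type fractions

intervals = {}
for depth in (1e3, 1e8):
    # Assumption: the posterior is Dirichlet with concentration
    # fraction * depth + 1 (a toy stand-in, not the package's model).
    alpha = [f * depth + 1 for f in true_fractions]
    draws = [sample_dirichlet(alpha, rng) for _ in range(2000)]
    rare = [d[2] for d in draws]  # samples for the rare cell type
    intervals[depth] = credible_interval(rare)
    lo, hi = intervals[depth]
    print(f"depth={depth:.0e}: 95% interval = ({lo:.6f}, {hi:.6f})")
```

In this toy setup the lower bound is always positive, so an interval that "excludes zero" says nothing about whether the cell type is actually present.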
(In reply to Kristoffer Vitting-Seerup's message of Fri, Feb 7, 2020.)
kvittingseerup commented
Hi Tinyi
Thanks for the detailed response - I think your arguments are sound.
I guess the only real use case would be rare cell populations, where the difference between present and absent is crucial but hard to determine without credible intervals (as all fraction values are >0 by default).
If you wanted, it could be an optional argument (defaulting to false), together with a workflow in the vignette showing how to use it.
Cheers
Kristoffer
tinyi commented
Hi Kristoffer,
In your case, the credible interval of the posterior distribution is not
the right thing to test, because the Dirichlet distribution is strictly
positive. Instead, one would take the fractions inferred from multiple
biological replicates and test theta > epsilon, for some epsilon > 0.
Best,
Tinyi
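One way to sketch the replicate-based test of theta > epsilon is an exact one-sided sign test on the per-replicate point estimates. The sign test is only one possible choice and is not prescribed anywhere in this thread; the replicate values and the threshold `epsilon` below are made up for illustration.

```python
from math import comb


def sign_test_greater(values, epsilon):
    """Exact one-sided sign test of H0: median <= epsilon vs H1: median > epsilon.

    Returns (k, n, p_value), where n counts the values not equal to epsilon
    and k counts how many of those lie above it.
    """
    informative = [v for v in values if v != epsilon]  # ties carry no sign
    n = len(informative)
    k = sum(v > epsilon for v in informative)
    # P(X >= k) for X ~ Binomial(n, 0.5)
    p_value = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return k, n, p_value


# Hypothetical point estimates of one rare cell type's fraction,
# one value per biological replicate.
replicate_fractions = [0.012, 0.009, 0.015, 0.011, 0.013, 0.010, 0.014, 0.008]
epsilon = 0.005  # hypothetical cutoff below which the cell type counts as absent

k, n, p_value = sign_test_greater(replicate_fractions, epsilon)
print(f"{k} of {n} replicates above epsilon, one-sided p = {p_value:.5f}")
```

The choice of epsilon is a biological judgment, and with few replicates the smallest attainable p-value is 1 / 2^n, so the number of replicates limits what such a test can show.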
(In reply to Kristoffer Vitting-Seerup's message of Wed, Feb 12, 2020.)