Best practice for estimating contamination within the same ancestry

Question

Han-Cao opened this issue a year ago · 1 comment

Hi,

I am wondering what the best way is to estimate contamination for samples that share the same ancestry background.

In my work, all samples are East Asian. If I want to estimate cross-contamination among them, should I use the "--WithinAncestry" parameter? Also, is it correct to use an East Asian-only reference panel? Would it give better results than a multi-ancestry reference panel like 1000G?

Thanks,
Han

Answer 1 · 2023-10-19T18:38:02.000Z

Hi Han,
The best practice I would recommend is to use the default settings, with the "between-ancestry" model and a diverse reference panel like the 1000G one in the repo, because we're trying to detect unexpected contamination events, whose source is not known in advance.
However, if you are confident in your setup, or have finer resolution in your populations, consider using your own customized reference panel.