navinlabcode/copykit

germline gain not called in single cells

Closed this issue · 4 comments

Hello, we have, as a positive control, single cells with Picoplex WGA which carry a ~1.7 Mb germline triplication (total copy number 4). We are having trouble calling it at 220 kb bin size, even though the bins with high copy numbers are visible. We will look at smaller bin sizes, but I wonder whether you would expect us to see something like this in a good-quality cell? For what it's worth, Ginkgo gives the same results in the cells we compared. I can post output figures or send you files, but you are probably too busy, so no problem if you don't have any specific advice. Thanks.

See attached. We also tried running it without marking duplicates, in case this somehow caused the problem, but this didn't help. (I realise that cell L4 has an issue with chr9 and X alignments missing; we will recheck.)
Screenshot 2022-08-17 at 12 24 31

Hi @proukakis,

Yeah, I see the issue.
A 1.7 Mb region should be detectable at the 220 kb resolution. The most likely reason is that the segmentation parameters are too strict. There are two parameters that we can try changing to solve this issue.
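
As a rough back-of-the-envelope check of why a region this size should be visible, here is a minimal sketch. It assumes fixed-width 220 kb bins and a diploid background; neither is stated in this thread, the actual bins may vary in width, and the function names are made up for illustration.

```python
import math

REGION_BP = 1_700_000  # ~1.7 Mb triplication, as reported above
BIN_BP = 220_000       # nominal bin size (assumption: fixed-width bins)
BACKGROUND_CN = 2      # assumption: diploid background
REGION_CN = 4          # total copy number, as reported above


def equivalent_bins(region_bp: int, bin_bp: int) -> float:
    """Region length expressed in units of the nominal bin width."""
    return region_bp / bin_bp


def expected_log2_ratio(region_cn: int, background_cn: int) -> float:
    """Ideal, noise-free log2 ratio of the region against the background."""
    return math.log2(region_cn / background_cn)


if __name__ == "__main__":
    print(f"bins covered: {equivalent_bins(REGION_BP, BIN_BP):.1f}")
    print(f"expected log2 ratio: {expected_log2_ratio(REGION_CN, BACKGROUND_CN):.1f}")
```

Under these assumptions the event covers between seven and eight bins at an ideal ratio of twice the background, so failing to call it points at how strict the segmentation is rather than at the resolution.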

The first one is the alpha from CBS segmentation. This parameter can be controlled during runVarbin(). Could you try changing it from the default to a value in the range of 1e-2 to 1e-4 and let me know the results?

If the alpha does not help, it is possible that the threshold for merge levels is too strict. This parameter is not accessible as an argument as of now, but it really should be, and I'll push a correction for this sometime soon so you can test it.

The consequence of changing these parameters will be slightly noisier profiles. Let me know how it goes.

Hi, sorry for the delay (was on holiday). Changing alpha doesn't seem to help, thanks.
fibroblasts_copykit - Read-Only.pptx

This is related to control of the alpha parameters for segmentation and merge levels.
A future version of copykit will reduce the sensitivity to avoid this problem.