How to deal with technical duplication
Dear all,
Because some samples in my study had low sequencing depth, I constructed another library for these samples and sequenced them at high coverage. When preparing the input files for dcHiC, I'm not sure I'm doing the right thing. For these samples, I simply merged the validPairs files (not the allValidPairs files) using cat, and then converted the merged file to sparse matrix/bed files using buildmatrix.
Sincerely.
Zheng zhuqing
Hi,
You can treat each validpair file as an independent replicate. To do that, create count.matrix_1 from validpair1 and matrix_2 from validpair2 by running the buildmatrix program separately on each file. If you provide the replicates, dcHiC can utilize the intra-replicate variation present in your dataset. Concatenation is the next best option. Let us know how the results look and whether they make sense to you.
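
As a rough sketch of the per-replicate approach, the script below prints one matrix-building command per library instead of merging first. It assumes that "buildmatrix" refers to HiC-Pro's build_matrix utility and that the flags follow its usual usage; the file names, bin size, and chromosome sizes file are hypothetical placeholders, so check `build_matrix --help` on your installation before running anything.

```shell
#!/bin/sh
# Sketch only: one build_matrix run per library, so each library becomes its own replicate.
# Assumptions (not from this thread): the flag names follow HiC-Pro's build_matrix,
# and all file names and values below are hypothetical.

BINSIZE=100000          # hypothetical resolution in bp
CHRSIZES=chrom.sizes    # hypothetical chromosome sizes file

# Print the command that converts one validPairs file into sparse matrix/bed files.
# $1 = validPairs file, $2 = output prefix
matrix_cmd() {
    printf 'build_matrix --matrix-format upper --binsize %s --chrsizes %s --ifile %s --oprefix %s\n' \
        "$BINSIZE" "$CHRSIZES" "$1" "$2"
}

# Preferred: keep the two libraries apart and build one matrix for each.
matrix_cmd sampleA_lib1.validPairs sampleA_rep1
matrix_cmd sampleA_lib2.validPairs sampleA_rep2

# Fallback (next best option): concatenate first, then build a single matrix, e.g.
#   cat sampleA_lib1.validPairs sampleA_lib2.validPairs > sampleA_merged.validPairs
#   matrix_cmd sampleA_merged.validPairs sampleA_merged
```

The script only prints the commands so they can be reviewed first; piping its output to `sh` would execute them. The two resulting matrices would then be listed as separate replicates of the same sample in the dcHiC input.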