marbl/CHM13

CHM13 to GRCh37/GRCh38 liftOver


Dear author,

I was wondering whether the T2T consortium could provide an over.chain file for liftOver of coordinates from GRCh37/GRCh38 to the CHM13 v1 draft genome and vice versa. I could create the file myself from lastz alignments, but I thought a publicly available one would be useful for others as well.

Regards,
Sangjin

There is a liftover available from UCSC, courtesy of Mark Diekhans: http://t2t.gi.ucsc.edu/chm13/hub/t2t-chm13-v1.0/hg38Lastz/t2t-chm13-v1.0.hg38.over.chain.gz. This liftover is preliminary, and we will likely have an improved one in the next few weeks, at which point we'll add it to the downloads in the README.
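For readers unfamiliar with what an over.chain file contains, here is a minimal sketch of how a single 0-based position can be mapped through the UCSC chain format. It is not the liftOver tool itself, which applies additional rules for intervals and partial matches. The chain below is a made-up toy, not taken from the T2T file, and the sketch assumes the first coordinate set in each header is the source side and lies on the + strand.

```python
from typing import List, NamedTuple, Optional, Tuple


class Block(NamedTuple):
    """One ungapped aligned block of a chain (all coordinates 0-based)."""
    t_name: str    # source (reference) sequence name
    t_start: int   # block start on the source
    size: int      # block length
    q_name: str    # destination (query) sequence name
    q_start: int   # block start on the destination, in chain orientation
    q_size: int    # total destination sequence length
    q_strand: str  # '+' or '-'


def parse_chain(text: str) -> List[Block]:
    """Parse chain-format text into a flat list of ungapped blocks."""
    blocks: List[Block] = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # blank line between chains
        if fields[0] == "chain":
            # chain score tName tSize tStrand tStart tEnd
            #       qName qSize qStrand qStart qEnd id
            t_name, t_pos = fields[2], int(fields[5])
            q_name, q_size = fields[7], int(fields[8])
            q_strand, q_pos = fields[9], int(fields[10])
            continue
        size = int(fields[0])
        blocks.append(Block(t_name, t_pos, size, q_name, q_pos, q_size, q_strand))
        if len(fields) == 3:
            # "size dt dq": skip the gaps that follow this block on each side
            t_pos += size + int(fields[1])
            q_pos += size + int(fields[2])
    return blocks


def lift(blocks: List[Block], chrom: str, pos: int) -> Optional[Tuple[str, int]]:
    """Map one source position; return None if it falls outside every block."""
    for b in blocks:
        if b.t_name == chrom and b.t_start <= pos < b.t_start + b.size:
            q = b.q_start + (pos - b.t_start)
            if b.q_strand == "-":
                # '-' strand coordinates count from the reverse-complemented end
                q = b.q_size - 1 - q
            return b.q_name, q
    return None


# Toy chain (hypothetical names and numbers): two blocks separated by a gap.
TOY_CHAIN = """\
chain 1000 chrA 100 + 10 40 chrB 120 + 20 55 1
10 5 10
15
"""

blocks = parse_chain(TOY_CHAIN)
print(lift(blocks, "chrA", 12))  # ('chrB', 22)
print(lift(blocks, "chrA", 22))  # None: the position lies in an unaligned gap
```

Positions that fall in gaps between blocks have no counterpart, which is why liftOver reports some records as unmapped.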

Hi! I was just wondering if there's any update on a chain file for v1.1? Many thanks for the great job with this (:

bw2 commented

Just posting to say +1 and to get notified if an equivalent of hg38Lastz/t2t-chm13-v1.0.hg38.over.chain.gz is added for
http://t2t.gi.ucsc.edu/chm13/hub/t2t-chm13-v1.1/
similar to
http://t2t.gi.ucsc.edu/chm13/hub/t2t-chm13-v1.0/

See marbl/CHM13-issues#3: "If you're not looking at the rDNA/telomere, v1.0 should be equivalent. In the meantime, the CHM13 GitHub page has a liftover file from chm13 v1.0 to v1.1 in case you want to translate any data on the browser to v1.1: https://github.com/marbl/CHM13. There's also the gff for both v1.1 and v1.0 there." You can also do a two-step translation from GRCh38 to v1.0 then v1.0 to v1.1.
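The two-step translation mentioned above amounts to composing two liftovers, where a position survives only if both steps succeed. A minimal sketch of that composition follows. The two lifts here are hypothetical fixed-offset stand-ins with made-up numbers, not the real GRCh38 to v1.0 or v1.0 to v1.1 chains.

```python
from typing import Callable, Optional, Tuple

Locus = Tuple[str, int]                      # (chromosome, 0-based position)
Lift = Callable[[Locus], Optional[Locus]]    # returns None when unmapped


def make_offset_lift(chrom: str, start: int, end: int, offset: int) -> Lift:
    """Toy stand-in for a chain: shifts positions in [start, end) by offset."""
    def lift(locus: Locus) -> Optional[Locus]:
        name, pos = locus
        if name == chrom and start <= pos < end:
            return (name, pos + offset)
        return None
    return lift


def two_step(first: Lift, second: Lift) -> Lift:
    """Compose two lifts; a locus is dropped if either step cannot map it."""
    def lift(locus: Locus) -> Optional[Locus]:
        intermediate = first(locus)
        if intermediate is None:
            return None
        return second(intermediate)
    return lift


# Hypothetical numbers, for illustration only.
grch38_to_v1_0 = make_offset_lift("chr1", 0, 1000, 50)
v1_0_to_v1_1 = make_offset_lift("chr1", 0, 900, -3)
grch38_to_v1_1 = two_step(grch38_to_v1_0, v1_0_to_v1_1)

print(grch38_to_v1_1(("chr1", 100)))  # ('chr1', 147)
print(grch38_to_v1_1(("chr1", 900)))  # None: maps in step one, lost in step two
```

Because losses accumulate across the two steps, it may be worth keeping the unmapped records from each step separately.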

bw2 commented

I've added hg38 => t2t v1.0 to
liftover.broadinstitute.org

Hi, does anyone know which version of hg38 should be used for this chain?

Hi Team,

Thanks for making all the datasets available here. I noticed a large discrepancy in the number of entries between the two versions of the RepeatMasker annotation (about 5.6M vs. 11.4M):

5,586,304 rmsk.bigBed (from t2t at UCSC after converting to bed using bigBedToBed)
11,369,721 GCF_009914755.1_T2T-CHM13v2.0_rm.out

I'd appreciate an explanation of the difference. Is it due to the use of different repeat libraries, selective inclusion in rmsk.bigBed, or something else?

Thank you,
Ping Liang
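
For anyone wanting to reproduce entry counts like the ones quoted above, here is a minimal sketch of one way to count records in a BED file (such as bigBedToBed output) and in a RepeatMasker .out file. It assumes that .out data rows start with an integer Smith-Waterman score while header rows do not. The sample rows are made up for illustration, and the sketch does not explain the discrepancy itself.

```python
from typing import Iterable


def count_bed_records(lines: Iterable[str]) -> int:
    """Count BED rows, skipping blanks, comments, and track/browser lines."""
    count = 0
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "track", "browser")):
            continue
        count += 1
    return count


def count_rm_out_records(lines: Iterable[str]) -> int:
    """Count RepeatMasker .out rows, i.e. lines whose first field is an integer."""
    count = 0
    for line in lines:
        fields = line.split()
        if fields and fields[0].isdigit():
            count += 1
    return count


# Made-up toy inputs; real files would be opened and passed in line by line.
TOY_BED = """\
track name=toy
chr1\t0\t100\tL1
chr1\t200\t300\tAluY
"""

TOY_RM_OUT = """\
   SW   perc perc perc  query     position in query    matching  repeat
score   div. del. ins.  sequence  begin end   (left)   repeat    class/family

  239   29.4  1.9  1.0  chr1   1   100 (900) + L1    LINE/L1   1 100 (0) 1
  512   10.2  0.0  0.5  chr1 201   300 (700) + AluY  SINE/Alu  1 100 (0) 2
  118   22.0  3.1  0.0  chr1 401   450 (550) C MIR   SINE/MIR  (0) 50 1  3
"""

bed_count = count_bed_records(TOY_BED.splitlines())
rm_count = count_rm_out_records(TOY_RM_OUT.splitlines())
print(bed_count, rm_count)  # 2 3
```

With real files, passing an open file handle instead of `splitlines()` avoids loading millions of rows into memory at once.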