mira.tl.get_distance_to_TSS(): human genome equivalent for the genome_file parameter
Hi,
It is not clear to me what the mm10.genome.sorted file represents in the following code from your tutorial. Does it contain the chromosome sizes? I want to run this analysis on a human multiome dataset. Where can I find the equivalent information for the human genome, please?
mira.tl.get_distance_to_TSS(atac_main,
    tss_data=tss_data,
    peak_chrom='chr',
    peak_start='start',
    peak_end='end',
    gene_id='geneSymbol',
    gene_chrom='chrom',
    gene_strand='strand',
    gene_start='txStart',
    gene_end='txEnd',
    genome_file='mm10.genome.sorted')
Would the following work for the human genome?
chr1 248956422
chr2 242193529
chr3 198295559
chr4 190214555
chr5 181538259
chr6 170805979
chr7 159345973
chr8 145138636
chr9 138394717
chr10 133797422
chr11 135086622
chr12 133275309
chr13 114364328
chr14 107043718
chr15 101991189
chr16 90338345
chr17 83257441
chr18 80373285
chr19 58617616
chr20 64444167
chr21 46709983
chr22 50818468
chrX 156040895
chrY 57227415
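A hand-copied list like this is easy to get subtly wrong, so it may be worth checking it before passing it as genome_file. The sketch below is only illustrative: the helper names (parse_chrom_sizes, check_chrom_sizes, write_genome_file) and the output filename are hypothetical and not part of MIRA. It assumes the file is a two-column "name, size" table and that "sorted" means lexicographic order by chromosome name, as in a bedtools-style genome file. The last line of the example input is a deliberately malformed toy record.

```python
def parse_chrom_sizes(text):
    """Parse whitespace-separated 'name size' lines into (name, size) tuples."""
    records = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        name, size = line.split()
        records.append((name, int(size)))
    return records


def check_chrom_sizes(records):
    """Return warnings for common copy-paste problems in a chromosome size list."""
    warnings = []
    seen = {}  # size -> first chromosome name seen with that size
    for name, size in records:
        if not name.startswith("chr"):
            warnings.append(f"{name}: missing 'chr' prefix")
        if size in seen:
            warnings.append(f"{name}: same size as {seen[size]} ({size})")
        else:
            seen[size] = name
    return warnings


def write_genome_file(records, path):
    """Write a tab-separated file sorted lexicographically by chromosome name."""
    with open(path, "w") as handle:
        for name, size in sorted(records):
            handle.write(f"{name}\t{size}\n")


# Toy input: three plausible lines plus one deliberately malformed record ("Z").
example = """chr1 248956422
chr2 242193529
chr10 133797422
Z 242193529
"""
records = parse_chrom_sizes(example)
warnings = check_chrom_sizes(records)
for warning in warnings:
    print(warning)

# Once the list is clean, it could be written out (hypothetical filename):
# write_genome_file(records, "hg38.genome.sorted")
```

Comparing the sizes against an authoritative source for the assembly in use (for example, the chromosome sizes file distributed with the reference genome) is a more reliable check than these heuristics. The chromosome names also need to match those used in the peak and TSS tables.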
Thanks so much for your help.
thanks very much!