estepi/ASpli

Getting gene names from genome TxDb using T2T genome

Hi, thanks for making ASpli!

So far, I've generated an integrated signals output with a list of AS events, which looks great. However, instead of gene names, I only get the coordinates and locus identifiers (e.g. CHM13_G0011354).

I'm using a T2T assembly and generated the genome TxDb with this code:
genomeTxDb <- makeTxDbFromGFF(file = "Z:/Genome_files/chm13v2.0_GENCODEv35_CAT_Liftoff.vep.gff3", format = "gff3", organism = "Homo sapiens")

To add gene names, I found this code in the ASpli documentation:
symbols <- data.frame( row.names = genes( aTxDb ), symbol = paste( 'This is symbol of gene:', genes( aTxDb ) ) )
features <- binGenome( aTxDb, geneSymbols = symbols )

But when I run something like this with my genomeTxDb, I get this error:

Error in data.frame(row.names = genes(genomeTxDb), symbol = paste("This is symbol of gene:", :
duplicate row.names: chr1:201631648-201632266:-, chr15:18551285-18551711:-, chr15:80271240-80284203:+
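
In case it helps narrow things down, here is a minimal base-R sketch of what I think is happening. The duplicated names in the error are coordinate strings, so my guess is that genes( genomeTxDb ) is being converted to coordinates when used as row.names, and that some genes in this annotation share exactly the same span. The coordinates and gene IDs below are made-up stand-ins, not values from my TxDb:

```r
# Hypothetical stand-ins: coordinate strings in the style of the error above,
# and invented gene IDs in the style of the T2T annotation.
coords   <- c("chr1:201631648-201632266:-",
              "chr1:201631648-201632266:-",   # same span, different gene
              "chr15:80271240-80284203:+")
gene_ids <- c("CHM13_G0000001", "CHM13_G0000002", "CHM13_G0000003")

# Coordinates as row names: data.frame() refuses duplicated row names,
# so the error is caught here and its message is stored instead.
by_coords <- tryCatch(
  data.frame(row.names = coords, symbol = paste("symbol of", gene_ids)),
  error = function(e) conditionMessage(e)
)
print(by_coords)

# Gene IDs as row names: each ID occurs once, so the table is built.
symbols <- data.frame(row.names = gene_ids,
                      symbol = paste("symbol of", gene_ids))
print(symbols)
```

If that reading is right, I assume the equivalent on the real object would be to use names( genes( genomeTxDb ) ) for row.names rather than genes( genomeTxDb ) itself, but I haven't confirmed that this is what binGenome expects, and I would still need a source for the actual symbols.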

Does anyone know how I can get around this so that I can append the gene symbols?

Thanks!
Katie