Equivalent genomic coordinates for transcript structure
wlymanambry opened this issue · 2 comments
Is it possible to get equivalent genomic coordinates for the exon structure?
Right now I can get the exon structure which looks something like:
"c_transcript_structure": [
[1, 158], [159, 4138] ],
But I'd like to also know the corresponding genomic positions for those relative transcript positions.
Thanks!
Since you are building JSON, have you seen cdot? It provides transcript info in JSON, so it may already be what you need.
E.g. check out the JSON from here: http://cdot.cc/transcript/NM_001001890.3
Exons are a list of (genomic start, genomic end, exon ID, tx start, tx end, cigar for alignment gaps), e.g.:
"exons": [
[
36160097,
36164907,
5,
2474,
7283,
null
],
You can download gzipped JSON files
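As a rough sketch of how those exon rows could be used, the snippet below maps a transcript position to a genomic position. It assumes the genomic start/end are 0-based half-open and the transcript start/end are 1-based inclusive (the lengths in the row above agree with that reading: 36164907 - 36160097 = 7283 - 2474 + 1), that the strand is known from elsewhere in the JSON, and that the exon has no alignment gaps. The function name `tx_pos_to_genomic` and the `strand` argument are made up for illustration and are not part of cdot.

```python
def tx_pos_to_genomic(exons, tx_pos, strand):
    """Map a 1-based transcript position to a 1-based genomic position.

    exons: rows of (genomic start, genomic end, exon ID, tx start, tx end, gap).
    Assumption: genomic coordinates are 0-based half-open and transcript
    coordinates are 1-based inclusive. Gapped exons are not handled.
    Returns None if the position is not inside any exon.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    for g_start, g_end, _exon_id, tx_start, tx_end, gap in exons:
        if tx_start <= tx_pos <= tx_end:
            if gap is not None:
                raise NotImplementedError("gapped exons are not handled in this sketch")
            offset = tx_pos - tx_start
            if strand == "+":
                # Walk forward from the exon's genomic start.
                return g_start + offset + 1
            # Minus strand: the transcript runs from the exon's genomic end backwards.
            return g_end - offset
    return None


# The exon row quoted above (JSON null becomes None in Python).
exons = [[36160097, 36164907, 5, 2474, 7283, None]]
# The strand here is an assumption; read it from the transcript JSON in practice.
print(tx_pos_to_genomic(exons, 2474, "-"))
```

For transcripts with alignment gaps the last field is not null, and a simple offset like this would no longer be enough.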
Doing it via Biocommons
In the data provider you should be able to call get_tx_exons
for the transcript_accession + contig for the genomic coordinate
The help docs say its return is
{
'tes_exon_set_id' : 98390
'aes_exon_set_id' : 298679
'tx_ac' : 'NM_199425.2'
'alt_ac' : 'NC_000020.10'
'alt_strand' : -1
'alt_aln_method' : 'splign'
'ord' : 2
'tx_exon_id' : 936834
'alt_exon_id' : 2999028
'tx_start_i' : 786
'tx_end_i' : 1196
'alt_start_i' : 25059178
'alt_end_i' : 25059588
'cigar' : '410='
}
You look to be using [tx_start_i, tx_end_i] in your code above.
The genomic coords for the contig are [alt_start_i, alt_end_i].
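A minimal sketch of pairing the two coordinate sets, assuming each exon row behaves like a dict with the keys shown in the help output. No data provider call is made here. The single row is copied from that output, and in practice the rows would come from `get_tx_exons`. The helper name `exon_structure_with_genomic` is hypothetical, and the conversion to 1-based inclusive ranges assumes the `_i` fields are 0-based interbase coordinates (which matches the `410=` cigar: 1196 - 786 = 410).

```python
def exon_structure_with_genomic(tx_exons):
    """Pair each exon's transcript range with its genomic range.

    Assumption: *_start_i / *_end_i are 0-based interbase coordinates,
    so [start + 1, end] is the equivalent 1-based inclusive range.
    """
    structure = []
    for exon in sorted(tx_exons, key=lambda e: e["ord"]):
        structure.append({
            "ord": exon["ord"],
            "transcript": [exon["tx_start_i"] + 1, exon["tx_end_i"]],
            "genomic": [exon["alt_start_i"] + 1, exon["alt_end_i"]],
            "contig": exon["alt_ac"],
            "strand": exon["alt_strand"],
        })
    return structure


# One row, copied from the help output above (only the fields used here).
sample_rows = [{
    "tx_ac": "NM_199425.2",
    "alt_ac": "NC_000020.10",
    "alt_strand": -1,
    "ord": 2,
    "tx_start_i": 786,
    "tx_end_i": 1196,
    "alt_start_i": 25059178,
    "alt_end_i": 25059588,
    "cigar": "410=",
}]

for entry in exon_structure_with_genomic(sample_rows):
    print(entry)
```

On the minus strand (`alt_strand` of -1) the genomic range is still reported low to high, so the first transcript base of the exon corresponds to the high end of the genomic range.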
Let me know if this is enough to help you out, so I can close the issue. Thanks!
Yep this is what I was looking for. Thanks so much!