biocommons/hgvs

Is there a way to get the CDS structure for a given transcript?

Closed this issue · 2 comments

HGVS provides the following transcript information:

Example transcript: NM_001126049.1

"exon_structure": "89618918,89623194",
"relative structure": "[[1, 4277]]",
"tl_start_site": 951,
"tl_stop_site": 1487

I also want the corresponding CDS structure, just as we have the exon structure, something like:
"cds_structure": "89621708,89622244",

We were computing this as (first exon start position) + (tl_start_site) and (last exon end position) + (tl_stop_site), or the opposite in this case, given that this transcript is on the reverse strand: 89623194 - 951 = 89622243, + 1 = 89622244, and likewise 89623194 - 1487 + 1 = 89621708. So our CDS structure is: "cds_structure": "89621708,89622244",
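For reference, the reverse-strand arithmetic above can be written as a short Python sketch. It only reproduces the calculation described in this post and assumes a single exon that aligns fully and without gaps; `naive_reverse_strand_cds` is a hypothetical helper name, not part of hgvs.

```python
def naive_reverse_strand_cds(exon_end, tl_start_site, tl_stop_site):
    """Offset arithmetic from this post for a reverse-strand transcript.

    Assumes one fully aligned exon; it does not handle partial or gapped
    alignments, which is exactly the limitation described in this issue.
    """
    # The CDS start in the transcript maps to the higher genomic coordinate.
    cds_high = exon_end - tl_start_site + 1
    # The CDS stop in the transcript maps to the lower genomic coordinate.
    cds_low = exon_end - tl_stop_site + 1
    return cds_low, cds_high


# Values for NM_001126049.1 as quoted above.
print(naive_reverse_strand_cds(89623194, 951, 1487))
# (89621708, 89622244)
```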

The problem we hit doing it this way is that it doesn't account for partially aligning exons. Is there a straightforward way to get this CDS structure from the library that takes this into consideration? One out-of-the-box idea I had was to check the genomic positions of c.1 and c.*1 - 1 (the last coding base). This would at least get the start and stop correct, but I'm not sure about all of the positions in between.
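To make the "other positions" concern concrete, here is a rough sketch of what a per-exon calculation might look like: intersect each exon's transcript interval with the CDS interval and project the overlap onto the genome. This is not an hgvs API. The function name, the tuple layout, the 0-based half-open coordinate convention, and the toy coordinates are all assumptions for illustration. It deliberately refuses exons whose transcript and genomic lengths differ, since a gapped or partial alignment would need the alignment details rather than plain offsets.

```python
def cds_segments(exons, cds_start, cds_end, strand):
    """Project the transcript interval [cds_start, cds_end) onto the genome.

    exons: (tx_start, tx_end, g_start, g_end) tuples in transcript order,
           all 0-based half-open (an assumed convention for this sketch).
    strand: 1 for forward, -1 for reverse.
    Returns genomic (start, end) pairs in transcript order.
    """
    segments = []
    for tx_start, tx_end, g_start, g_end in exons:
        if tx_end - tx_start != g_end - g_start:
            # Gapped or partial alignment: simple offsets are not valid here.
            raise ValueError("exon alignment is not ungapped")
        lo = max(tx_start, cds_start)
        hi = min(tx_end, cds_end)
        if lo >= hi:
            continue  # exon lies entirely in a UTR
        if strand == 1:
            segments.append((g_start + (lo - tx_start), g_start + (hi - tx_start)))
        else:
            # On the reverse strand, transcript offsets count down from g_end.
            segments.append((g_end - (hi - tx_start), g_end - (lo - tx_start)))
    return segments


# Toy two-exon transcripts (hypothetical coordinates), CDS = [50, 200).
forward = [(0, 100, 1000, 1100), (100, 250, 2000, 2150)]
reverse = [(0, 100, 2050, 2150), (100, 250, 1000, 1150)]
print(cds_segments(forward, 50, 200, 1))   # [(1050, 1100), (2000, 2100)]
print(cds_segments(reverse, 50, 200, -1))  # [(2050, 2100), (1050, 1150)]
```

For real transcripts, the endpoints from a sketch like this could be cross-checked against the genomic positions of c.1 and the last coding base, as suggested above.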

Thanks!

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

This issue was closed because it has been stalled for 7 days with no activity.