crisprVerse/crisprDesignData

Why does txdb_human contain fewer transcripts?

Closed this issue · 2 comments

For example, see this gene in Ensembl: http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000187017;r=1:6424776-6461367. Ensembl lists 13 transcripts identified as protein coding for this gene, but txdb_human contains only three of them:

load("crisprDesignData/data/txdb_human.rda")
txdb_human$transcripts[which(txdb_human$transcripts$gene_symbol=="ESPN")]
GRanges object with 3 ranges and 14 metadata columns:
   seqnames          ranges strand |           tx_id         gene_id      protein_id
      <Rle>       <IRanges>  <Rle> |     <character>     <character>     <character>
       chr1 6424776-6460944      + | ENST00000645284 ENSG00000187017 ENSP00000496593
       chr1 6424788-6456671      + | ENST00000636330 ENSG00000187017 ENSP00000490186
       chr1 6448043-6460399      + | ENST00000461727 ENSG00000187017 ENSP00000465308
          tx_type gene_symbol     exon_id exon_rank cds_start   cds_end  tx_start    tx_end
      <character> <character> <character> <integer> <integer> <integer> <integer> <integer>
   protein_coding        ESPN        <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
   protein_coding        ESPN        <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
   protein_coding        ESPN        <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
     cds_len exon_start  exon_end
   <integer>  <integer> <integer>
        <NA>       <NA>      <NA>
        <NA>       <NA>      <NA>
        <NA>       <NA>      <NA>
  -------
  seqinfo: 25 sequences (1 circular) from hg38 genome

Hi @panxiaoguang,

The object txdb_human only contains the Gencode Basic annotation, which is why you see only 3 transcripts for this gene.
You can build your own txdb object with the Gencode Comprehensive annotation by calling crisprDesign::getTxDb and changing the tx_attrib argument.
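
A minimal sketch of what that could look like. It assumes that getTxDb forwards tx_attrib to the underlying Ensembl import, that passing NULL there disables the Gencode Basic filter, and that TxDb2GRangesList produces the same layout as txdb_human. The release number is only an illustration, so check the crisprDesign documentation before relying on it:

```r
library(crisprDesign)

# Assumption: tx_attrib = NULL removes the default "gencode_basic" filter,
# so all Ensembl transcripts are kept. Requires network access to Ensembl.
txdb_full <- getTxDb(organism = "Homo sapiens",
                     release = 104,      # illustrative release; use the one you need
                     tx_attrib = NULL)

# Convert the TxDb into the GRangesList layout used by txdb_human
# (assumed to be what crisprDesignData ships).
txdb_human_full <- TxDb2GRangesList(txdb_full)

# Same query as in the question, now against the unfiltered annotation.
tx <- txdb_human_full$transcripts
tx[which(tx$gene_symbol == "ESPN" & tx$tx_type == "protein_coding")]
```

If the assumptions hold, the last call should return more ESPN transcripts than the three in txdb_human. The exact number depends on the Ensembl release.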

Thank you very much. txdb_human works well for me as long as the annotation is accurate enough.