Gene ID dropdown not showing options with gff option

Question

preetida opened this issue 5 years ago · 7 comments

Hi Joe,
I am using covviz for ~1500 samples. After generating the bed file with goleft indexcov, I am running covviz with the following command.

covviz --ped testcoviz/testcoviz-indexcov.ped --gff ~/scratch/Homo_sapiens.GRCh38.99.gtf.gz testcoviz/testcoviz-indexcov.bed.gz -o CHIP_Coverage

In my HTML output I don't see any options in the GeneID dropdown. Do I have to specify the gene feature in an option?

I'm referring to this line in the documentation: "Currently we support GFF, VCF, and BED. GFF tracks are added using --gff where features are 'gene' and attributes have 'Name='. Feature type and attribute regex can be configured using --gff-feature and --gff-attr."

Answer 1 · 2020-04-01T20:27:36.000Z

Yes, you probably just need to use a new pattern to grab the gene IDs from the attributes field of the GFF.

--gff-feature refers to column 3. By default it looks for lines annotated as 'gene'.

--gff-attr refers to column 9, and by default it'll look to split out the gene name using 'Name='.
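To make the two options concrete, here is a minimal Python sketch of that kind of filtering. It is not covviz's actual code: the function name `extract_gene_name`, the regex used to capture the value, and the example GFF3 line are all assumptions made for illustration.

```python
import re


def extract_gene_name(line, feature="gene", attr="Name="):
    """Return the name found in column 9 of one GFF line, or None if skipped.

    Hypothetical helper: `feature` plays the role of --gff-feature (column 3)
    and `attr` plays the role of --gff-attr (a regex applied to column 9).
    """
    if line.startswith("#"):
        return None  # header or comment line
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 9 or cols[2] != feature:
        return None  # not the feature type we are looking for
    # Capture everything after the attribute pattern up to the next ';'.
    match = re.search(attr + r"([^;]+)", cols[8])
    return match.group(1) if match else None


# Made-up GFF3-style line where the default 'Name=' pattern applies.
gff3_line = "1\thavana\tgene\t11869\t14409\t.\t+\t.\tID=gene:ENSG00000223972;Name=DDX11L1"
# GTF-style line: column 9 has no 'Name=', so the default finds nothing.
gtf_line = '1\thavana\tgene\t11869\t14409\t.\t+\t.\tgene_id "ENSG00000223972"; gene_name "DDX11L1";'

print(extract_gene_name(gff3_line))
print(extract_gene_name(gtf_line))
```

With the defaults, the GTF-style line yields no name at all, which would explain an empty dropdown.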

Could you send a few lines of your gtf or point me to where your annotation was downloaded from?

Answer 2 · 2020-04-01T21:12:49.000Z

Downloaded from here: ftp://ftp.ensembl.org/pub/release-99/gtf/homo_sapiens/
Here are a few lines of the file.
#!genebuild-last-updated 2019-08
1 havana gene 11869 14409 . + . gene_id "ENSG00000223972"; gene_version "5"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene";

Answer 3 · 2020-04-01T21:33:11.000Z

Try covviz with --gff-attr "gene_name ".

I'm pretty sure that'll leave the quote marks. I'll push an update to handle removing any remaining quote marks after the regex has been applied.
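As a rough illustration of why the quote marks survive and how they could be removed afterwards, here is a small Python sketch run against the attributes from the line posted above. The helper names `raw_value` and `clean_value` are hypothetical, and the regex is an assumption rather than what covviz actually runs.

```python
import re

# Column 9 of the Ensembl GTF line posted above.
gtf_attrs = (
    'gene_id "ENSG00000223972"; gene_version "5"; gene_name "DDX11L1"; '
    'gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene";'
)


def raw_value(attrs, attr="gene_name "):
    """Return the text between the attribute pattern and the next ';' (assumed regex)."""
    match = re.search(attr + r"([^;]+)", attrs)
    return match.group(1) if match else None


def clean_value(attrs, attr="gene_name "):
    """Same as raw_value, but with surrounding whitespace and quote marks removed."""
    value = raw_value(attrs, attr)
    if value is None:
        return None
    return value.strip().strip("\"'")


print(raw_value(gtf_attrs))    # the value still wrapped in double quotes
print(clean_value(gtf_attrs))  # the bare gene name
```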

Answer 4 · 2020-04-01T22:47:04.000Z

Thanks, that worked!
Though it has " " around the names in the GeneID dropdown.

On a separate issue, it only highlights one sample and not the others. Any clue why?

Answer 5 · 2020-04-01T22:54:45.000Z

Here are the files: the ped file, the bed file generated with indexcov, and the output. I'm struggling with reading the output. For example, in terms of 1x, 5x, 10x, how do I say that a certain sample has 1x coverage?
testcoviz-indexcov.bed.gz <https://drive.google.com/file/d/1_-2OuoJwXDolwSJK3CUMvIV49EBx6sZB/view?usp=drive_web>
Here is the output. Can you suggest how to interpret this? Thanks,

Answer 6 · 2020-04-02T00:20:54.000Z

Since the input bed file is coming from indexcov, can you include --skip-norm in your covviz call and see how that looks? Or maybe coverage is really poor in a lot of the samples.

Coverage is normalized to 1x per sample from indexcov, so you won't be able to say precise depths per sample. You can analyze each alignment file to get coverages pretty rapidly with https://github.com/brentp/mosdepth.
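A toy Python sketch of why absolute depth can't be recovered after per-sample scaling. It assumes a simple divide-by-median normalization, which may not match what indexcov does internally; the function name `normalize_to_1x` and the depth values are made up for illustration.

```python
from statistics import median


def normalize_to_1x(depths):
    """Scale one sample's per-bin depths so that its median becomes 1.0 (assumed scheme)."""
    m = median(depths)
    return [d / m for d in depths]


# Two hypothetical samples with the same coverage shape but different sequencing depth.
sample_10x = [10, 10, 5, 10, 20]
sample_30x = [30, 30, 15, 30, 60]

print(normalize_to_1x(sample_10x))
print(normalize_to_1x(sample_30x))
```

Both samples come out identical after scaling, so the plot can show relative dips and gains within a sample but not whether that sample was sequenced at 10x or 30x.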

What you're seeing with the highlight (green line) is a sample that deviates significantly from the rest of the cohort through those coordinates. We had to do it this way because if you attempt to draw all ~1500 lines for all points along the x-axis, the browser will become unresponsive and will likely run out of RAM in the process. The upper and lower bounds, and everything shaded gray between them, represent the other ~1499 samples.
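The idea of drawing a cohort envelope plus only the deviating samples can be sketched in Python as below. This is only an illustration: the thread does not describe covviz's actual outlier criteria, so the median-distance rule, the `threshold` value, and the name `summarize_cohort` are all assumptions.

```python
from statistics import median


def summarize_cohort(coverage, threshold=0.5):
    """Return (lower, upper, outliers) for a dict of sample -> per-bin normalized values.

    lower/upper are the per-bin min and max across all samples (the gray band);
    outliers are samples that sit more than `threshold` away from the cohort
    median in at least one bin (hypothetical rule, chosen for illustration).
    """
    samples = list(coverage)
    n_bins = len(coverage[samples[0]])
    lower, upper, outliers = [], [], set()
    for i in range(n_bins):
        column = [coverage[s][i] for s in samples]
        lower.append(min(column))
        upper.append(max(column))
        mid = median(column)
        for s in samples:
            if abs(coverage[s][i] - mid) > threshold:
                outliers.add(s)
    return lower, upper, sorted(outliers)


# Made-up cohort: s4 drops well below the others in the second bin.
cohort = {
    "s1": [1.0, 1.0, 1.0],
    "s2": [1.1, 0.9, 1.0],
    "s3": [0.9, 1.0, 1.1],
    "s4": [1.0, 0.4, 1.0],
}

lower, upper, outliers = summarize_cohort(cohort)
print(lower, upper, outliers)
```

Only the band and the flagged samples need to be drawn, so the amount of data sent to the browser stays small no matter how many samples are in the cohort.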

That should give you some sense of how to interpret these outputs, but please follow up if I'm still unclear.

Answer 7 · 2020-04-02T00:31:45.000Z

I pushed a new release to address the quotes. You can install via pip (pip install -U covviz).