pcgr-summarise: AttributeError: 'dict' object has no attribute 'INFO'
pdiakumis opened this issue · 3 comments
pdiakumis commented
Running the latest PCGR (v1.3.0) with GRCh38, the run fails at the pcgr-summarise step. I think it stumbles on the recent change in pcgr-summarise
(dd037cb), but it might just be the test VCF I'm using (though this hasn't happened with previous PCGR versions):
Version
$ pcgr --version
pcgr 1.3.0
Command
INPUT="../input/pcgr/SEQC-II__SEQC-II_tumour-somatic-somatic.vcf.gz"
SAMPLE="SEQC"
OUTDIR="../out/pcgr/${SAMPLE}"
pcgr \
  --debug \
  --output_dir "${OUTDIR}" \
  --assay "WGS" \
  --control_dp_tag "NORMAL_DP" \
  --control_af_tag "NORMAL_AF" \
  --tumor_dp_tag "TUMOR_DP" \
  --tumor_af_tag "TUMOR_AF" \
  --estimate_tmb \
  --force_overwrite \
  --genome_assembly "grch38" \
  --input_vcf "${INPUT}" \
  --pcgr_dir "../" \
  --report_theme "default" \
  --sample_id "${SAMPLE}" \
  --include_trials \
  --estimate_msi_status \
  --vep_buffer_size 5000 \
  --vep_pick_order "biotype,rank,appris,tsl,ccds,canonical,length,mane" \
  --pcgrr_conda umccrise_pcgrr
Log
2023-03-07 12:37:34 - pcgr-validate-input-arguments - INFO - PCGR - STEP 0: Validate input data and options
2023-03-07 12:37:34 - pcgr-validate-input-arguments - INFO - pcgr_validate_input.py /Users/pdiakumis/projects/sigverse /Users/pdiakumis/projects/sigverse/input/pcgr/SEQC-II__SEQC-II_tumour-somatic-somatic.vcf.gz None None None None 1 0 grch38 None TUMOR_DP TUMOR_AF NORMAL_DP NORMAL_AF _NA_ 0 0 --output_dir /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC --debug
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - Skipping validation of VCF file (deprecated as of Dec 2021)
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - Checking if existing INFO tags of query VCF file coincide with PCGR INFO tags
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - No query VCF INFO tags coincide with PCGR INFO tags
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - Found INFO tag for normal/control allelic fraction (control_af_tag NORMAL_AF) in input VCF
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - Found INFO tag for normal/control variant sequencing depth (control_dp_tag NORMAL_DP) in input VCF
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - Found INFO tag for tumor variant allelic fraction (tumor_af_tag TUMOR_AF) in input VCF
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - Found INFO tag for tumor variant sequencing depth (tumor_dp_tag TUMOR_DP) in input VCF
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - Extracting variants on autosomal/sex/mito chromosomes only (1-22,X,Y, M/MT) with bcftools
2023-03-07 12:37:37 - pcgr-validate-input-arguments - INFO - bcftools view /Users/pdiakumis/projects/sigverse/input/pcgr/SEQC-II__SEQC-II_tumour-somatic-somatic.vcf.gz | bgzip -cf > /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.tmp2.vcf.gz && tabix -p vcf /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.tmp2.vcf.gz && bcftools sort --temp-dir /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC -Oz /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.tmp2.vcf.gz > /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.tmp3.vcf.gz 2> /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/bcftools_1.pcgr_simplify_vcf.log && tabix -p vcf /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.tmp3.vcf.gz
2023-03-07 12:37:39 - pcgr-validate-input-arguments - INFO - bcftools view --regions chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrX,chrY,chrM,chrMT,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y,M,MT /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.tmp3.vcf.gz | egrep -v '^##FORMAT=' | cut -f1-8 | sed 's/^chr//' > /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.tmp1.vcf
2023-03-07 12:37:39 - pcgr-validate-input-arguments - INFO - All sites seem to be decomposed - skipping decomposition!
2023-03-07 12:37:39 - pcgr-validate-input-arguments - INFO - cp /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.tmp1.vcf /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vcf
2023-03-07 12:37:39 - pcgr-validate-input-arguments - INFO - bgzip -f /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vcf
2023-03-07 12:37:39 - pcgr-validate-input-arguments - INFO - tabix -p vcf /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vcf.gz
2023-03-07 12:37:39 - pcgr-validate-input-arguments - INFO - Finished pcgr-validate-input-arguments
----
2023-03-07 12:37:39 - pcgr-start - INFO - --- Personal Cancer Genome Reporter workflow ----
2023-03-07 12:37:39 - pcgr-start - INFO - Sample name: SEQC
2023-03-07 12:37:39 - pcgr-start - INFO - Tumor type: Any
2023-03-07 12:37:39 - pcgr-start - INFO - Sequencing assay - type: WGS
2023-03-07 12:37:39 - pcgr-start - INFO - Sequencing assay - mode: Tumor vs. Control
2023-03-07 12:37:39 - pcgr-start - INFO - Sequencing assay - coding target size: 34Mb
2023-03-07 12:37:39 - pcgr-start - INFO - Genome assembly: grch38
2023-03-07 12:37:39 - pcgr-start - INFO - Mutational signature estimation: OFF
2023-03-07 12:37:39 - pcgr-start - INFO - MSI classification: ON
2023-03-07 12:37:39 - pcgr-start - INFO - Mutational burden estimation: ON
2023-03-07 12:37:39 - pcgr-start - INFO - Include molecularly targeted clinical trials (beta): ON
----
2023-03-07 12:37:39 - pcgr-vep - INFO - PCGR - STEP 1: Basic variant annotation with Variant Effect Predictor (105, GENCODE 39, grch38)
2023-03-07 12:37:39 - pcgr-vep - INFO - VEP configuration - one primary consequence block pr. alternative allele (--flag_pick_allele)
2023-03-07 12:37:39 - pcgr-vep - INFO - VEP configuration - transcript pick order: biotype,rank,appris,tsl,ccds,canonical,length,mane
2023-03-07 12:37:39 - pcgr-vep - INFO - VEP configuration - transcript pick order: See more at https://www.ensembl.org/info/docs/tools/vep/script/vep_other.html#pick_options
2023-03-07 12:37:39 - pcgr-vep - INFO - VEP configuration - GENCODE set: GENCODE - basic transcript set (--gencode_basic)
2023-03-07 12:37:39 - pcgr-vep - INFO - VEP configuration - skip intergenic: FALSE
2023-03-07 12:37:39 - pcgr-vep - INFO - VEP configuration - regulatory annotation: OFF
2023-03-07 12:37:39 - pcgr-vep - INFO - VEP configuration - buffer_size/number of forks: 5000/4
2023-03-07 12:37:39 - pcgr-vep - INFO - unset PERL5LIB && export PATH=/Users/pdiakumis/projects/umccrise/conda/umccrise_pcgr/bin:"$PATH" && vep --input_file /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vcf.gz --output_file /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcf.gz --dir /Users/pdiakumis/projects/sigverse/data/grch38/.vep --assembly GRCh38 --cache_version 105 --fasta /Users/pdiakumis/projects/sigverse/data/grch38/.vep/homo_sapiens/105_GRCh38/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz --pick_order biotype,rank,appris,tsl,ccds,canonical,length,mane --buffer_size 5000 --fork 4 --hgvs --af --af_1kg --af_gnomad --variant_class --domains --symbol --protein --ccds --mane --uniprot --appris --biotype --tsl --canonical --format vcf --cache --numbers --total_length --allele_number --no_stats --no_escape --xref_refseq --vcf --check_ref --dont_skip --flag_pick_allele_gene --plugin NearestExonJB,max_range=50000 --force_overwrite --species homo_sapiens --offline --compress_output bgzip --verbose --gencode_basic
Possible precedence issue with control flow operator at /Users/pdiakumis/projects/umccrise/conda/umccrise_pcgr/lib/site_perl/5.26.2/Bio/DB/IndexedBase.pm line 805.
2023-03-07 12:37:56 - pcgr-vep - INFO - tabix -f -p vcf /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcf.gz
2023-03-07 12:37:56 - pcgr-vep - INFO - Finished pcgr-vep
----
2023-03-07 12:37:56 - pcgr-vcfanno - INFO - PCGR - STEP 2: Annotation for precision oncology with pcgr-vcfanno
2023-03-07 12:37:56 - pcgr-vcfanno - INFO - Annotation sources: ClinVar, dbNSFP, UniProtKB, cancerhotspots.org, CiVIC, CGI, DoCM, CHASMplus driver mutations, TCGA, ICGC-PCAWG
2023-03-07 12:37:56 - pcgr-vcfanno - INFO - pcgr_vcfanno.py /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcf.gz /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcfanno.vcf /Users/pdiakumis/projects/sigverse/data/grch38 --num_processes 4 --chasmplus --dbnsfp --docm --clinvar --icgc --civic --cgi --tcga_pcdm --winmsk --simplerepeats --tcga --uniprot --cancer_hotspots --pcgr_onco_xref --debug --keep_logs
2023-03-07 12:37:57 - pcgr-vcfanno - INFO - vcfanno -p=4 /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcfanno.vcf.tmp.conf.toml /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcf.gz > /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcfanno.vcf.tmp.unsorted.1 2> /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcfanno.log
2023-03-07 12:37:59 - pcgr-vcfanno - INFO - bgzip -f /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcfanno.vcf
2023-03-07 12:37:59 - pcgr-vcfanno - INFO - tabix -f -p vcf /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcfanno.vcf.gz
2023-03-07 12:37:59 - pcgr-vcfanno - INFO - Finished pcgr-vcfanno
----
2023-03-07 12:37:59 - pcgr-summarise - INFO - PCGR - STEP 3: Cancer gene annotations with pcgr-summarise
2023-03-07 12:37:59 - pcgr-summarise - INFO - pcgr_summarise.py /Users/pdiakumis/projects/sigverse/out/pcgr/SEQC/SEQC-II__SEQC-II_tumour-somatic-somatic.pcgr_ready.vep.vcfanno.vcf.gz 0 0 /Users/pdiakumis/projects/sigverse/data/grch38 --debug
Traceback (most recent call last):
  File "/Users/pdiakumis/projects/umccrise/conda/umccrise_pcgr/bin/pcgr_summarise.py", line 165, in <module>
    __main__()
  File "/Users/pdiakumis/projects/umccrise/conda/umccrise_pcgr/bin/pcgr_summarise.py", line 28, in __main__
    extend_vcf_annotations(args.vcf_file, args.pcgr_db_dir, logger, args.pon_annotation, args.regulatory_annotation, args.cpsr, args.debug)
  File "/Users/pdiakumis/projects/umccrise/conda/umccrise_pcgr/bin/pcgr_summarise.py", line 138, in extend_vcf_annotations
    rec.INFO[k] = record[k]
AttributeError: 'dict' object has no attribute 'INFO'
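For what it's worth, the message itself suggests that `rec` is bound to a plain dict at line 138, rather than to a VCF record object with an `INFO` attribute. The sketch below only reproduces that class of error in isolation; it is not PCGR's code. `FakeVariant` and `copy_annotations` are hypothetical names made up for illustration, and I have not confirmed how `rec` ends up as a dict in `pcgr_summarise.py`.

```python
class FakeVariant:
    """Hypothetical stand-in for a VCF record object exposing an INFO mapping."""

    def __init__(self):
        self.INFO = {}


def copy_annotations(rec, record):
    """Copy every key of the dict `record` into `rec.INFO` (same statement as line 138)."""
    for k in record:
        rec.INFO[k] = record[k]
    return rec


# Hypothetical annotation values, not taken from the failing VCF.
annotations = {"CONSEQUENCE": "missense_variant"}

# Works: `rec` is an object that has an INFO attribute.
ok = copy_annotations(FakeVariant(), annotations)

# Fails in the same way as the traceback: `rec` is itself a plain dict.
try:
    copy_annotations({}, annotations)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

If this is what is happening, a likely suspect would be a variable name being reused for both the VCF record and a per-record dict, but that is a guess until the commit is checked.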
sigven commented
I am on it👍
sigven commented
@pdiakumis, could you test with https://github.com/sigven/pcgr/tree/pick_trans_consequence_patch?
pdiakumis commented
Works splendidly, thanks a lot ;-)