Illumina/REViewer

[error] Failed to extract reads from the specified region

Hello!

I have a problem running the main command. As far as I understand, the --locus option expects a LocusId string from the ExpansionHunter (EH) output file. For instance, here is a fragment of my result file (.json):
.. },
"ATXN10": {
  "AlleleCount": 2,
  "Coverage": 0.32432432432432434,
  "FragmentLength": 215,
  "LocusId": "ATXN10",
  "ReadLength": 150,
  "Variants": {
    "ATXN10": {
      "CountsOfFlankingReads": "()",
      "CountsOfInrepeatReads": "()",
      "CountsOfSpanningReads": "()",
      "ReferenceRegion": "chr22:46191234-46191304",
      "RepeatUnit": "ATTCT",
      "VariantId": "ATXN10",
      "VariantSubtype": "Repeat",
      "VariantType": "Repeat"
    }
  } ..
I tried passing ATXN10 as the argument to --locus, but without success.

By the way, I looked at Issue #2, in particular at the command posted by bw2. In that case, the LocusId in the EH output file includes the contig name and coordinates:
"LocusId": "FXN-chr9-69037286-69037304-GAA". That makes me wonder whether my EH output file is wrong.

Thank you so much!

Thank you for reporting the issue! It looks like you are doing everything correctly. EH was just unable to analyze this locus due to very low coverage ("Coverage": 0.32) and lack of informative reads ("CountsOfFlankingReads": "()", "CountsOfInrepeatReads": "()", "CountsOfSpanningReads": "()"). This repeat should be marked as "LowDepth" in the corresponding VCF file. EH requires that the read depth is at least 10.
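
One way to confirm the "LowDepth" marking mentioned above is to scan the FILTER column of the VCF that EH wrote. The sketch below is not part of REViewer or EH: `low_depth_records` is a hypothetical helper that relies only on the standard tab-separated VCF layout (FILTER is the 7th column, INFO the 8th) and assumes the filter is spelled exactly "LowDepth", as in the reply.

```python
def low_depth_records(vcf_lines):
    """Return (chrom, pos, info) for every record whose FILTER lists LowDepth.

    vcf_lines is any iterable of text lines, e.g. an open file handle.
    """
    flagged = []
    for line in vcf_lines:
        # Skip blank lines and the "##" / "#CHROM" header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        chrom, pos, filt, info = fields[0], fields[1], fields[6], fields[7]
        # FILTER may hold several semicolon-separated values.
        if "LowDepth" in filt.split(";"):
            flagged.append((chrom, pos, info))
    return flagged
```

It could be called as `low_depth_records(open("exphout_sample.vcf"))`, where the file name is a placeholder for the VCF produced alongside the JSON.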

The next version of REViewer will produce a better error message in such cases.

That sounds great! Thanks a lot, Egor!

Great! Happy to help!

Sorry, it's me again. I have checked another one:
},
"ATXN3": {
  "AlleleCount": 2,
  "Coverage": 11.837837837837837,
  "FragmentLength": 334,
  "LocusId": "ATXN3",
  "ReadLength": 150,
  "Variants": {
    "ATXN3": {
      "CountsOfFlankingReads": "()",
      "CountsOfInrepeatReads": "()",
      "CountsOfSpanningReads": "()",
      "Genotype": "0/0",
      "GenotypeConfidenceInterval": "0-714/0-714",
      "ReferenceRegion": "chr14:92537353-92537386",
      "RepeatUnit": "GCT",
      "VariantId": "ATXN3",
      "VariantSubtype": "Repeat",
      "VariantType": "Repeat"
    }
Here the coverage seems good, doesn't it? However, I get the same error.
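
The maintainer's reply names two conditions: read depth below 10, and all three read-count fields empty. Both can be checked across a whole EH result file in one pass. The following is a minimal sketch, assuming the JSON has a top-level "LocusResults" object shaped like the fragments in this thread; `uninformative_loci` is a hypothetical helper, not an EH or REViewer function, and it does not claim to reproduce REViewer's own logic.

```python
import json

MIN_DEPTH = 10  # threshold quoted in the maintainer's reply

COUNT_FIELDS = (
    "CountsOfFlankingReads",
    "CountsOfInrepeatReads",
    "CountsOfSpanningReads",
)


def uninformative_loci(eh_json, min_depth=MIN_DEPTH):
    """Map locus ID -> list of reasons the locus looks uninformative."""
    report = {}
    for locus_id, locus in eh_json["LocusResults"].items():
        reasons = []
        if locus.get("Coverage", 0.0) < min_depth:
            reasons.append("low coverage")
        for variant_id, variant in locus.get("Variants", {}).items():
            # "()" is how the fragments above show an empty count list.
            if all(variant.get(f, "()") == "()" for f in COUNT_FIELDS):
                reasons.append(f"no reads counted for {variant_id}")
        if reasons:
            report[locus_id] = reasons
    return report


def load_eh_json(path):
    """Read an EH output file from disk (path is supplied by the caller)."""
    with open(path) as handle:
        return json.load(handle)
```

On the ATXN3 fragment above, this would pass the depth check (coverage is about 11.8) but still report that no reads were counted, since all three count fields are "()".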

I have this problem too, although my coverage is over 30:

"LocusResults": {
    "AFF2": {
      "AlleleCount": 2,
      "Coverage": 31.54054054054054,
      "FragmentLength": 352,
      "LocusId": "AFF2",
      "ReadLength": 150,
      "Variants": {
        "AFF2": {
          "CountsOfFlankingReads": "()",
          "CountsOfInrepeatReads": "()",
          "CountsOfSpanningReads": "()",
          "Genotype": "0/0",
          "GenotypeConfidenceInterval": "0-714/0-714",
          "ReferenceRegion": "X:147582151-147582211",
          "RepeatUnit": "GCC",
          "VariantId": "AFF2",
          "VariantSubtype": "Repeat",
          "VariantType": "Repeat"
        }
      }
    },

Not sure what's going on. I removed that entry from the catalog but still got the same error, now at this one (the first one):

"AR": {
      "AlleleCount": 2,
      "Coverage": 41.91891891891891,
      "FragmentLength": 350,
      "LocusId": "AR",
      "ReadLength": 150,
      "Variants": {
        "AR": {
          "CountsOfFlankingReads": "(1, 2), (2, 2), (3, 3), (4, 1), (5, 1), (6, 2), (8, 6), (9, 1), (10, 3), (15, 3), (17, 1), (18, 3), (19, 5), (20, 2), (23, 1), (24, 1), (26, 1), (28, 1)",
          "CountsOfInrepeatReads": "()",
          "CountsOfSpanningReads": "(27, 16), (28, 9)",
          "Genotype": "27/28",
          "GenotypeConfidenceInterval": "27-27/28-28",
          "ReferenceRegion": "X:66765158-66765227",
          "RepeatUnit": "GCA",
          "VariantId": "AR",
          "VariantSubtype": "Repeat",
          "VariantType": "Repeat"
        }
      }
    },

I thought maybe the problem was that there are no InrepeatReads, but the problem remains even if I only include catalog entries where all three read categories have members, e.g.:

"LocusResults": {
    "RFC1": {
      "AlleleCount": 2,
      "Coverage": 41.513513513513516,
      "FragmentLength": 351,
      "LocusId": "RFC1",
      "ReadLength": 150,
      "Variants": {
        "RFC1": {
          "CountsOfFlankingReads": "(1, 2), (2, 2), (3, 3), (4, 2), (5, 2), (7, 2), (8, 5), (10, 2), (11, 1), (15, 1), (16, 1), (17, 2), (20, 1), (21, 1), (22, 1), (25, 1), (29, 4)",
          "CountsOfInrepeatReads": "(30, 15), (31, 1), (32, 1)",
          "CountsOfSpanningReads": "(10, 11), (15, 1)",
          "Genotype": "10/65",
          "GenotypeConfidenceInterval": "10-10/53-102",
          "ReferenceRegion": "4:39350044-39350099",
          "RepeatUnit": "AARRG",
          "VariantId": "RFC1",
          "VariantSubtype": "Repeat",
          "VariantType": "Repeat"
        }
      }
    }
  }
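
To compare read categories across loci, the count strings can be turned into numbers. The sketch below assumes each pair in a "CountsOf..." string is (repeat size, number of reads), which is an interpretation of the fragments above rather than something stated in this thread; `parse_counts` and `total_reads` are hypothetical helper names.

```python
import re

# Matches one "(a, b)" pair of non-negative integers.
_PAIR = re.compile(r"\((\d+),\s*(\d+)\)")


def parse_counts(field):
    """Parse a string such as "(10, 11), (15, 1)" into a list of int pairs.

    An empty list "()" yields [].
    """
    return [(int(a), int(b)) for a, b in _PAIR.findall(field)]


def total_reads(field):
    """Sum the second element of every pair (assumed to be a read count)."""
    return sum(n for _, n in parse_counts(field))
```

Under that assumption, the RFC1 entry above has 12 spanning reads and 17 in-repeat reads, so all three categories are populated even though the error persists.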

[2022-03-25 14:42:00.258] [info] Loading specification of locus RFC1
[2022-03-25 14:42:00.260] [error] Failed to extract reads from the specified region

What to do? :-(

REViewer-v0.2.7-linux_x86_64\
	--reads "$out"/exphout_"${sample}"_realigned.sorted.bam\
	--reference "$genome"\
	--catalog "$catalog"\
	--vcf exphout_"${sample}".vcf\
	--locus "RFC1"\
	--output-prefix "$out"/reviewerout_"${sample}"

Grateful for help with this! Am excited to see the pretty diagrams...