Error in names(x) <- value : 'names' attribute [2] must be the same length as the vector [1]

Question

Opened this issue 4 years ago · 2 comments

I was trying to generate summary statistics with the run below, but summaryRuns stopped with the error shown at the end of the output:

slidingRuns <- slidingRUNS.run(genotypeFile = genotypeFilePath, mapFile = mapFilePath, windowSize = 15, threshold = 0.05, minSNP = 20, ROHet = FALSE, maxOppWindow = 1, maxMissWindow = 1, maxGap = 10^6, minLengthBps = 250000, minDensity = 1/10^3, maxOppRun = NULL,maxMissRun = NULL)
pesummaryList <- summaryRuns(runs = slidingRuns, mapFile = pemapFilePath, genotypeFile = pegenotypeFilePath, Class = 6, snpInRuns = TRUE)
Checking files...
Using class: 6
Total genome length: 274744571
calculating Froh on all genome
Total genome length: 274744571
calculating Froh chromosome by chromosome
Error in names(x) <- value :
  'names' attribute [2] must be the same length as the vector [1]

To Reproduce
I am working with the output from BCFtools but also encountered the same error after using the native slidingRUNS.run function. Please find the .map and .ped here:
https://drive.google.com/drive/folders/1oMJdU2tEf1XcLOHEbtEGL8Q9vsDb6K5l?usp=sharing

detectRUNS version
-v 0.9.6

platform
platform x86_64-conda-linux-gnu
arch x86_64
os linux-gnu
system x86_64, linux-gnu
status
major 4
minor 0.3
year 2020
month 10
day 10
svn rev 79318
language R
version.string R version 4.0.3 (2020-10-10)
nickname Bunny-Wunnies Freak Out

Answer 1 · 2021-03-08T11:23:00.000Z

Dear @kumarsaurabh20 ,

Thank you for your interest in detectRUNS. The problem is the chromosome names in your map file: the function Froh_inbreeding (not exported), which summaryRuns calls, currently accepts only chromosomes in the set c((0:99),"X","Y","XY","MT","Z","W") (see the Froh_inbreeding implementation for more info), while your data have chromosomes like this:

$ cut -f1 pe.map.txt | sort | uniq 
HiC_scaffold_2
HiC_scaffold_3
HiC_scaffold_4
HiC_scaffold_5
HiC_scaffold_6
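
To see up front which chromosome codes in a map file fall outside that accepted set, a quick shell check along these lines may help. It is only a sketch: unsupported_chroms is a hypothetical helper name (not part of detectRUNS), and the pattern simply mirrors the set quoted above rather than anything read from the package itself.

```shell
# Hypothetical helper: list chromosome codes in a PLINK-style .map file
# that are NOT in the set 0-99, X, Y, XY, MT, Z, W.
unsupported_chroms() {
  # Column 1 of a .map file is the chromosome code; awk splits on any whitespace.
  # grep -v exits non-zero when nothing is left over, so "|| true" keeps the
  # helper from failing when every code is accepted.
  awk '{ print $1 }' "$1" | sort -u \
    | grep -Ev '^([0-9]|[1-9][0-9]|X|Y|XY|MT|Z|W)$' || true
}
```

Called as unsupported_chroms pe.map.txt, it prints one offending code per line; empty output suggests every code is in the accepted set.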

I think we can plan to extend support to extra chromosomes (like scaffolds) in the future. In the meantime, you could rename your chromosomes; since they all share the same prefix, the simplest solution is to use sed (in a bash terminal):

$ sed 's/HiC_scaffold_//g' pe.map.txt > pe.map.txt.fix
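
A slightly stricter variant of the same idea, in case a variant ID in the second column also happens to contain the scaffold name: anchoring the pattern with ^ limits the substitution to the start of each line, i.e. the chromosome column. This is a sketch; strip_scaffold_prefix is a hypothetical helper name and the prefix is hard-coded for the file in this issue.

```shell
# Hypothetical helper: remove the HiC_scaffold_ prefix from column 1 only.
# Usage: strip_scaffold_prefix INPUT.map OUTPUT.map
strip_scaffold_prefix() {
  # "^" anchors the match to the start of the line, so later columns
  # (for example SNP IDs that embed the scaffold name) are left untouched.
  sed 's/^HiC_scaffold_//' "$1" > "$2"
}
```

For example, strip_scaffold_prefix pe.map.txt pe.map.txt.fix writes the same kind of fixed file as the command above.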

This changes HiC_scaffold_2 to 2, which is currently supported in detectRUNS:

genotypeFilePath <- "pe.ped.txt"
mapFilePath <- "pe.map.txt.fix"
slidingRuns <- slidingRUNS.run(
  genotypeFile = genotypeFilePath, mapFile = mapFilePath, windowSize = 15,
  threshold = 0.05, minSNP = 20, ROHet = FALSE, maxOppWindow = 1,
  maxMissWindow = 1, maxGap = 10^6, minLengthBps = 250000, minDensity = 1/10^3,
  maxOppRun = NULL, maxMissRun = NULL)
pesummaryList <- summaryRuns(
  runs = slidingRuns, mapFile = mapFilePath, genotypeFile = genotypeFilePath, 
  Class = 6, snpInRuns = TRUE)

After that, you could add the HiC_scaffold_ prefix back to the chromosome names in the resulting data frames using R, so that your input and output data stay consistent.
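
If the results end up exported from R to a tab-separated file rather than kept as data frames, the prefix could also be restored from the shell. The sketch below rests on several assumptions: the file has a header row, fields are unquoted (for example written with quote = FALSE), and the chromosome column is named by the caller. add_chrom_prefix is a hypothetical helper, not part of detectRUNS.

```shell
# Hypothetical helper: prepend PREFIX to every value of the column named
# COLUMN in a tab-separated file with a header row; prints to stdout.
# Usage: add_chrom_prefix PREFIX COLUMN FILE
add_chrom_prefix() {
  awk -v p="$1" -v col="$2" '
    BEGIN { FS = OFS = "\t" }
    NR == 1 {
      # Locate the requested column by its header name.
      for (i = 1; i <= NF; i++) if ($i == col) c = i
      print; next
    }
    c { $c = p $c }   # if the column was not found, rows pass through unchanged
    { print }
  ' "$3"
}
```

A call such as add_chrom_prefix HiC_scaffold_ chrom runs.tsv > runs.prefixed.tsv would then turn 2 back into HiC_scaffold_2, assuming the exported table has a column called chrom.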

Hope this helps

Answer 2 · 2021-03-08T13:22:03.000Z

@bunop Many thanks for your prompt reply. It worked.