apriha/snps

AttributeError: 'str' object has no attribute '_output_dir'


I'm using Ubuntu on AWS EC2, and lineage does not let me create an individual:

>>> user662 = l.create_individual('User662', '/home/ubuntu/myprojectdir/AaronAzuma.zip')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/myprojectdir/venv/lib/python3.8/site-packages/lineage/__init__.py", line 96, in create_individual
    return Individual(name, raw_data, self._output_dir, **kwargs)
AttributeError: 'str' object has no attribute '_output_dir'

Thanks for the issue. Can you provide more details or code snippets? I just tested installing and running the README examples in a Python 3.8 virtual environment without any issues.
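For reference, this exact message is what Python produces when an instance method is called on the class itself rather than on an instance, so that the first string argument is bound to `self`. Whether that is what happened here is not confirmed in this thread. The sketch below uses a hypothetical stand-in class (not the real `lineage.Lineage`) that keeps only the line shown in the traceback:

```python
class Lineage:
    """Hypothetical stand-in for lineage.Lineage, reduced for illustration."""

    def __init__(self, output_dir="output"):
        self._output_dir = output_dir

    def create_individual(self, name, raw_data=None, **kwargs):
        # Mirrors the line in the traceback: it reads self._output_dir.
        return (name, raw_data, self._output_dir)


# Assumed mistake: the class is bound to `l` without being instantiated.
l = Lineage
try:
    # 'User662' is bound to `self`, so self._output_dir fails on a str.
    l.create_individual("User662", "/path/to/file.zip")
    message = None
except AttributeError as err:
    message = str(err)

# With an instance, the same call works.
l = Lineage()
individual = l.create_individual("User662", "/path/to/file.zip")
```

If the session was started with `l = Lineage` instead of `l = Lineage()`, adding the parentheses would be the first thing to check.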

Thanks LaKisha, that helps. lineage uses the snps library to parse files, so I transferred the issue here.

snps should be able to read raw AncestryDNA or 23andMe files without conversion... However, snps could be updated to handle the format you pasted as well. Do you have a link to the tool that produces that format?

As for the H3Africa files, can you confirm that an example file would look like this (tab-separated):

rs1	1	101	AA
rs2	1	102	CC
rs3	1	103	GG
rs4	1	104	TT
rs5	1	105	--
rs6	1	106	GC
rs7	1	107	TC
rs8	1	108	AT
.
.
.

Thanks LaKisha. snps / lineage cannot parse your converted file because it tries to apply the AncestryDNA parser based on the comments, and that parser looks for whitespace between the alleles and column headers.

But you don't need to convert the file, since snps can already read AncestryDNA files (and the other formats discussed in the README). Give that a try and let me know how it works.

As for the H3Africa file, snps should also be able to read that.

And if you need a VCF file, you can save the SNPs in VCF format.

Closing since there are no updates required for this issue.

Sorry, I closed the issue too early. Upon further investigation, snps should be updated to handle the H3Africa format, since the generic parser is not invoked for it (there is no rsid in the first line). Also, the generic parser wouldn't be able to parse this file because its columns are separated by multiple whitespace characters.

So to handle this, snps could do either or both of the following:

  • check if "h3a" is in the first line and apply a parser similar to the AncestryDNA parser with multiple whitespace
  • apply a generic parser as a last check that tries to read four or five column files with multiple whitespace
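A minimal sketch of both ideas, using only the standard library. The function names `looks_like_h3a` and `parse_whitespace_snps` are hypothetical and not part of snps, and the column order (rsid, chromosome, position, genotype or two alleles) is an assumption based on the example file above:

```python
def looks_like_h3a(text):
    """Return True if 'h3a' appears in the first line (case-insensitive)."""
    lines = text.splitlines()
    first = lines[0] if lines else ""
    return "h3a" in first.lower()


def parse_whitespace_snps(text):
    """Hypothetical last-resort parser for four or five column files.

    Columns may be separated by any run of spaces or tabs.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()  # splits on any run of whitespace
        if len(fields) == 4:
            rsid, chrom, pos, genotype = fields
        elif len(fields) == 5:
            rsid, chrom, pos, allele1, allele2 = fields
            genotype = allele1 + allele2
        else:
            continue  # not a four or five column row
        if not pos.isdigit():
            continue  # likely a header row
        rows.append(
            {"rsid": rsid, "chrom": chrom, "pos": int(pos), "genotype": genotype}
        )
    return rows


sample = "rs1  1   101 AA\nrs2\t1\t102\tC\tC\nrsid chrom pos genotype\n"
rows = parse_whitespace_snps(sample)
```

Calling `str.split()` with no arguments treats any run of whitespace as one separator, which is what lets the same code handle both tab-separated and space-padded files.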

Hi @lakishadavid, please try creating a new virtual environment and installing lineage again. I've updated it to support the latest version of snps. FYI, here are some additional installation directions: https://lineage.readthedocs.io/en/latest/installation.html