
Tools for analyzing raw DNA test data files. Shows Y chromosome and mitochondrial mtDNA haplogroups. LGPLv3 - use freely but share improvements.

Primary language: Python. License: GNU Lesser General Public License v3.0 (LGPL-3.0).

SNIPSA

A small project experimenting with SNP genome data and Python.

snipsa-gui.py

Windows: unzip the Windows release package and start snipsa.bat in the main folder. The zip includes Python, all dependencies and the SNP databases. Thanks to TPN for the preinstalled Python package.

Linux: install dependencies and run snipsa-gui.py.

Experimental BAM alignment file support is now enabled by default.

FTDNA project files can be imported. In the project, go to DNA Results->Classic Chart, set Page Size to a number large enough to show all rows on one page, load the new page and save it with the browser's Save as.

haploy_find.py

This small tool reads a raw SNP data file and lists Y chromosome haplogroup information. You must first initialize the mutation database with haploy_db_import.py. The tool lists all the haplogroup-related mutations found in the database, sorted roughly by the age of the mutation.

./haploy_find.py <file>
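
As a rough illustration of what "reading a raw SNP data file" involves, here is a minimal sketch that collects Y chromosome calls. It assumes a tab-separated layout (rsid, chromosome, position, genotype) with `#` comment lines, which several testing companies use; the function name `read_y_snps` and the sample rows are made up for this example, and the parser inside snipsa may work differently and supports more formats.

```python
import io

def read_y_snps(lines):
    """Collect Y chromosome calls from tab-separated raw data lines.

    Assumed column order: rsid, chromosome, position, genotype.
    """
    calls = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # not a data row in the assumed layout
        rsid, chrom, pos, genotype = fields[:4]
        if chrom.upper() != "Y":
            continue  # only the Y chromosome is of interest here
        if not pos.isdigit() or genotype in ("", "--"):
            continue  # column header row or a no-call
        calls[int(pos)] = genotype
    return calls

# Made-up sample data, not taken from a real test result.
sample = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs1\t1\t1000\tAG\n"
    "rs2\tY\t2000\tC\n"
    "rs3\tY\t3000\t--\n"
)
y_calls = read_y_snps(sample)
print(y_calls)
```

The resulting position-to-genotype mapping is the kind of input a haplogroup lookup could compare against a mutation database.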

The tool also accepts multiple files. This can be used to search for haplogroups among the files, using common haplogroup names (separated by commas) or mutation names as the search keys.

./haploy_find.py N1a1,I2,L287 <files>...
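
The comma-separated search key can be pictured with the following sketch. It is only an illustration: `parse_search_keys`, `matching_files`, the file names and the per-file name sets are all hypothetical, and the real tool may match differently (for example by also reporting subclades of a searched haplogroup).

```python
def parse_search_keys(arg):
    """Split a key argument such as 'N1a1,I2,L287' into individual keys."""
    return [key.strip() for key in arg.split(",") if key.strip()]

def matching_files(keys, results):
    """Return, per file, the search keys found among its reported names.

    results maps a file name to the set of haplogroup and mutation names
    reported for that file (hypothetical structure for this example).
    """
    hits = {}
    for filename, names in results.items():
        found = [key for key in keys if key in names]
        if found:
            hits[filename] = found
    return hits

# Made-up results for three input files.
results = {
    "a.txt": {"N1a1", "L287"},
    "b.txt": {"R1b"},
    "c.txt": {"I2"},
}
keys = parse_search_keys("N1a1,I2,L287")
hits = matching_files(keys, results)
print(hits)
```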

haploy_db_import.py

This script converts the ISOGG database into the internal format used by haploy_find.py. It needs CrossMap (pip3 install CrossMap), a conversion chain file (crossmap/GRCh38_to_NCBI36.chain.gz) and the ISOGG spreadsheet in CSV format ('SNP Index - Human.csv'). It outputs a haploy_map.txt file, which is used by haploy_find.py. See haploy.py for details on other required input files.

haploy_anno_import.py

This example script imports your own annotations, which can be attached to the reported tree nodes. As an example, the script imports YFull person IDs and some selected FTDNA project files. Open the project chart tables in a browser, select a page size so that everything fits on one page, and edit the corresponding lines at the end of the script. Snipsa will load any files starting with haploy_annodb. An example file is included.
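
The "any files starting with haploy_annodb" rule amounts to a prefix match on file names. A minimal sketch, assuming the files sit in a single folder (which folder snipsa actually scans is not stated here); `find_annotation_files` and the file names below are made up for the demonstration:

```python
import tempfile
from pathlib import Path

def find_annotation_files(folder):
    """Return the names of files in folder that start with 'haploy_annodb'."""
    return sorted(
        path.name
        for path in Path(folder).iterdir()
        if path.is_file() and path.name.startswith("haploy_annodb")
    )

# Demonstrate on a temporary folder with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("haploy_annodb_example.txt",
                 "haploy_annodb_mine.txt",
                 "haploy_map.txt"):
        (Path(tmp) / name).write_text("")
    found = find_annotation_files(tmp)

print(found)
```

Under this rule, adding your own annotations means dropping another file with the haploy_annodb prefix next to the existing ones.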

haplomt_find.py

This small tool reads a raw SNP data file and lists mitochondrial DNA (mtDNA) haplogroup information. You must first initialize the mutation database with haplomt_db_import.py. The tool finds paths in the mutation tree and displays the best matches.

./haplomt_find.py <file>
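
One way to picture "finding paths in the mutation tree" is to score each tree node by how well the mutations along its path from the root agree with the sample. The sketch below is a toy illustration under that assumption, not the scoring used by haplomt.py; the tree, the mutation names `m1`..`m4` and all function names are invented.

```python
# Toy tree: node -> (parent, mutations that define the node).
TREE = {
    "root": (None, set()),
    "A": ("root", {"m1"}),
    "A1": ("A", {"m2", "m3"}),
    "B": ("root", {"m4"}),
}

def path_to(node):
    """Return the list of nodes from the root down to node."""
    path = []
    while node is not None:
        path.append(node)
        node = TREE[node][0]
    return path[::-1]

def score(node, observed):
    """Matched minus missing mutations along the path to node."""
    matched = missing = 0
    for name in path_to(node):
        mutations = TREE[name][1]
        matched += len(mutations & observed)
        missing += len(mutations - observed)
    return matched - missing

def best_matches(observed, top=2):
    """Rank all nodes by score and return the best few."""
    ranked = sorted(TREE, key=lambda name: score(name, observed), reverse=True)
    return ranked[:top]

best = best_matches({"m1", "m2", "m3"})
print(best)
```

Penalizing missing mutations keeps a deep branch from winning on a single chance match; a real implementation would also have to deal with no-calls and back mutations.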

The tool also accepts multiple files. This can be used to search for haplogroups among the files, using common haplogroup names (separated by commas) or mutation names as the search keys.

./haplomt_find.py 0 <files>...

haplomt_db_import.py

This script converts the PhyloTree.org mtDNA database file (https://www.phylotree.org/builds/mtDNA_tree_Build_17.zip) into the internal format. There is also experimental support for importing the yfull.com mtree. It outputs a haplomt_map.txt file, which is used by haplomt_find.py. See haplomt.py for details.