Searching paralog and ortholog databases to construct multigene families in primates
Two lists of 2-paralog multigene families were built:
- A less careful list, which combines the Duplicated Genes Database and the OrthoMAM database. This list has 113 pairs of genes.
- A more careful list, obtained by filtering the list above with the MetaPhOrs database. This list has 8 pairs of genes left.
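The filtering step above amounts to keeping only the pairs that a second source also reports. The following is a minimal sketch, assuming each database has already been exported to a list of gene-name pairs. The function names and the placeholder gene names are hypothetical and do not come from the actual databases.

```python
def normalize(pair):
    """Return an order-independent key, so (A, B) matches (B, A)."""
    return frozenset(pair)


def filter_pairs(candidates, supported):
    """Keep candidate pairs that also appear in the supporting source."""
    supported_keys = {normalize(p) for p in supported}
    return [p for p in candidates if normalize(p) in supported_keys]


# Placeholder input: candidate pairs, and pairs confirmed by a second source.
candidates = [("GENE_A1", "GENE_A2"), ("GENE_B1", "GENE_B2"), ("GENE_C1", "GENE_C2")]
supported = [("GENE_A2", "GENE_A1"), ("GENE_B1", "GENE_B2")]

kept = filter_pairs(candidates, supported)
print(kept)
```

Using a `frozenset` as the key means a pair is matched regardless of the order in which the two paralogs are listed in either source.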
DNA CDS sequences for 5 primate species were obtained from OrthoMAM v9:
- Homo sapiens
- Pan troglodytes
- Gorilla gorilla
- Pongo abelii
- Macaca mulatta
Next, Callithrix (NCBI taxonomy id 9583) was added as the outgroup. After this step, the two lists contain:
- Less careful list: 104 pairs of genes left.
- More careful list: 7 pairs of genes left.
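One way a pair can drop out at this step is that a sequence is missing for one of the six species. The sketch below checks a FASTA file for completeness. It assumes records are named as genus plus gene name (as in MacacaSTEAP4 and PongoIGSF3 in the notes); the other genus prefixes, the function names, and the toy data are assumptions for illustration only.

```python
# Genus prefixes assumed for record names; only Macaca and Pongo are
# attested in the notes, the rest follow the same pattern by assumption.
REQUIRED_SPECIES = ["Homo", "Pan", "Gorilla", "Pongo", "Macaca", "Callithrix"]


def read_fasta_names(text):
    """Return the record names (header lines without '>') of a FASTA string."""
    return [line[1:].strip() for line in text.splitlines() if line.startswith(">")]


def missing_records(names, gene_a, gene_b):
    """List expected species+gene records that are absent from the file."""
    expected = {sp + gene for sp in REQUIRED_SPECIES for gene in (gene_a, gene_b)}
    return sorted(expected - set(names))


# Toy FASTA text with one record (CallithrixGENE2) deliberately left out.
all_names = [sp + g for sp in REQUIRED_SPECIES for g in ("GENE1", "GENE2")]
toy_fasta = "".join(
    f">{name}\nATGTAA\n" for name in all_names if name != "CallithrixGENE2"
)

missing = missing_records(read_fasta_names(toy_fasta), "GENE1", "GENE2")
print(missing)
```

A pair would be kept only when the returned list is empty.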
Note:
- In STEAP4_STEAP2.fasta, a G was manually appended to the end of the MacacaSTEAP4 sequence to complete the stop codon.
- The CNTN6 CNTN4 pair was removed.
- In IGSF3_CD101.fasta, an A was manually appended to the end of the PongoIGSF3 sequence to complete the stop codon.
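Truncated stop codons like the ones noted above can be spotted with a simple check on each CDS. This is a minimal sketch, assuming the standard genetic code; the notes do not say how the truncated sequences were actually detected, and the function name and toy sequences are hypothetical.

```python
# Stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_cds(seq):
    """Report whether a CDS has a whole number of codons and a terminal stop."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        return "length not a multiple of 3"
    if seq[-3:] not in STOP_CODONS:
        return "no terminal stop codon"
    return "ok"


# Toy sequences, not real gene data.
print(check_cds("ATGGCCTA"))   # stop codon cut short by one base
print(check_cds("ATGGCCTAG"))  # same sequence with the final base restored
print(check_cds("ATGGCCGCC"))  # full codons but no stop at the end
```

A sequence flagged by the first check still needs manual inspection, since a trailing `TA` could be completed to either `TAA` or `TAG`.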