
Split GeMMA clusters into functionally coherent alignments (FunFams)




This is one of the repositories used to generate FunFams.

This repo splits GeMMA clusters into functionally coherent alignments (FunFams).

For the master repo, please see https://github.com/UCL/cath-funfam

See the FunFHMMer Wiki for documentation.


  1. Pre-process the data from the current GeMMA output when a directory is given

The FunFHMMer algorithm

The FunFHMMer algorithm identifies functional families in protein domain superfamilies. It determines an optimal cut of a superfamily's hierarchical clustering tree of sequence relatives, using a novel functional coherence index based on conserved positions and specificity-determining positions (SDPs) in sequence alignments [1]. FunFHMMer was used to generate FunFams in the CATH-Gene3D resource (v4.0). These FunFams were shown to group domain sequences more coherently by function than other domain classifications. The algorithm is also not limited to CATH: it can be used to sub-classify other widely used domain-based classification resources such as Pfam.
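The general idea of cutting a clustering tree with a coherence score can be illustrated with a small Python sketch. This is not the code in this repo (which is Perl) and it does not reproduce the FunFHMMer index: `toy_coherence` is a hypothetical stand-in that only measures the fraction of fully conserved alignment columns, and the `Node` class, the `threshold` parameter and the top-down traversal are likewise assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Node of a hierarchical clustering tree (hypothetical structure).

    `sequences` holds the aligned sequences of every leaf below this node.
    """
    sequences: List[str]
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def merge(left: Node, right: Node) -> Node:
    """Build a parent node whose alignment is the union of both children."""
    return Node(left.sequences + right.sequences, left, right)


def toy_coherence(sequences: List[str]) -> float:
    """Placeholder score: fraction of columns that are fully conserved.

    ASSUMPTION: this is NOT the FunFHMMer functional coherence index,
    which also uses specificity-determining positions (SDPs).
    """
    columns = list(zip(*sequences))
    if not columns:
        return 0.0
    conserved = sum(1 for col in columns if len(set(col)) == 1)
    return conserved / len(columns)


def cut_tree(node: Node, threshold: float) -> List[List[str]]:
    """Walk the tree top-down; keep a node whole if it is a leaf or scores
    at least `threshold`, otherwise split it into its two children."""
    is_leaf = node.left is None or node.right is None
    if is_leaf or toy_coherence(node.sequences) >= threshold:
        return [node.sequences]
    return cut_tree(node.left, threshold) + cut_tree(node.right, threshold)


# Four made-up aligned sequences forming two obvious groups.
leaves = [Node([s]) for s in ("ACDE", "ACDF", "GHIK", "GHIL")]
root = merge(merge(leaves[0], leaves[1]), merge(leaves[2], leaves[3]))

for family in cut_tree(root, threshold=0.7):
    print(family)
```

With a threshold of 0.7 the root is split because its four sequences share no conserved column, while each pair is kept together because three of its four columns are conserved. Raising the threshold makes the cut more conservative and yields smaller groups.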

Relevant Papers

  1. Functional classification of CATH superfamilies: a domain-based approach for protein function annotation

  2. CATH FunFHMMer web server: protein functional annotations using functional family assignments