EM algorithm to group protein sequences
Groups protein sequences into families using the expectation-maximization (EM) algorithm.
Run the script on the sample input : python run_em.py -i subseqs10.txt
- --input, -i : input file containing protein sequences
- --seqlen, -l : Length of the sequences to use (default: length of the input sequences)
- --numfamily, -nf : Number of groups or families (default: 5)
- --iters, -ii : Number of iterations to run (default: 100)
- --savefreq, -sf : Save results every this many iterations (default: 10)
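The options above could be declared with argparse roughly as follows. This is a minimal sketch, not the actual contents of run_em.py: the function name build_parser, treating --input as required, and using None to mean "use the length of the input sequences" are all assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options documented above."""
    parser = argparse.ArgumentParser(
        description="Group protein sequences with the EM algorithm."
    )
    # Assumption: the input file is mandatory.
    parser.add_argument("--input", "-i", required=True,
                        help="input file containing protein sequences")
    # Assumption: None stands for "use the length of the input sequences".
    parser.add_argument("--seqlen", "-l", type=int, default=None,
                        help="length of the sequences to use")
    parser.add_argument("--numfamily", "-nf", type=int, default=5,
                        help="number of groups or families")
    parser.add_argument("--iters", "-ii", type=int, default=100,
                        help="number of iterations to run")
    parser.add_argument("--savefreq", "-sf", type=int, default=10,
                        help="save results every this many iterations")
    return parser


if __name__ == "__main__":
    # Equivalent to: python run_em.py -i subseqs10.txt -nf 3
    args = build_parser().parse_args(["-i", "subseqs10.txt", "-nf", "3"])
    print(args.input, args.numfamily, args.iters, args.savefreq)
```

With only -i given, every other option falls back to the defaults listed above.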
Sample output file : subseqs10_iter0.txt
- class_label / soft_membership
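To illustrate how a class label and a soft membership can come out of EM, here is a toy sketch. It is not the code in run_em.py: the model (a mixture of families, each with independent per-position residue probabilities), the function name em_group, and the parameters seed and pseudocount are assumptions, and all sequences are assumed to have the same length.

```python
import math
import random


def em_group(seqs, num_family=2, iters=20, seed=0, pseudocount=0.1):
    """Toy EM for a mixture of position-specific residue distributions.

    Returns (class_labels, soft_memberships). Hypothetical model, for
    illustration only.
    """
    rng = random.Random(seed)
    n, length = len(seqs), len(seqs[0])
    alphabet = sorted(set("".join(seqs)))  # residues seen in the data

    # Random soft initialisation: each membership row sums to 1.
    member = []
    for _ in range(n):
        row = [rng.random() + 1e-3 for _ in range(num_family)]
        total = sum(row)
        member.append([r / total for r in row])

    for _ in range(iters):
        # M-step: smoothed family priors and per-position residue probabilities.
        prior = [
            (sum(member[i][k] for i in range(n)) + pseudocount)
            / (n + num_family * pseudocount)
            for k in range(num_family)
        ]
        prob = []
        for k in range(num_family):
            table = []
            for pos in range(length):
                counts = {a: pseudocount for a in alphabet}
                for i, s in enumerate(seqs):
                    counts[s[pos]] += member[i][k]
                total = sum(counts.values())
                table.append({a: c / total for a, c in counts.items()})
            prob.append(table)

        # E-step: posterior probability of each family given each sequence.
        for i, s in enumerate(seqs):
            logp = [
                math.log(prior[k])
                + sum(math.log(prob[k][pos][s[pos]]) for pos in range(length))
                for k in range(num_family)
            ]
            peak = max(logp)  # subtract the max for numerical stability
            weights = [math.exp(v - peak) for v in logp]
            total = sum(weights)
            member[i] = [w / total for w in weights]

    # Hard label = family with the largest soft membership.
    labels = [row.index(max(row)) for row in member]
    return labels, member


if __name__ == "__main__":
    demo = ["AAAA", "AAAA", "AAAC", "WWWW", "WWWW", "WWWY"]
    labels, member = em_group(demo, num_family=2, iters=30)
    for label, row in zip(labels, member):
        print(label, " ".join(f"{p:.3f}" for p in row))
```

In this sketch the soft membership of a sequence is its vector of posterior family probabilities, and the class label is the index of the largest entry.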