- kunitz_struct_align_clean.fasta is the multiple sequence alignment (MSA) produced by PDBe Fold, from which kunitz_model.hmm was built.
- pdbefold_results.dat contains data from the PDBe Fold run performed with the pdb_chain.txt identifier list.
- set1.txt and set2.txt are the two subsets used to compute the optimal threshold.
- entry_eval.class lists all SwissProt entries (positives + negatives) with their respective e-values.
- performance.py calculates the Matthews correlation coefficient (MCC) and the confusion matrix.
- positive_id_list.txt is the list of positive sequences (without the training sequences) that must be passed as input when running performance.py.
- common_sequence_remover.py is used to remove training FASTA sequences from the positive dataset.
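
For readers unfamiliar with the metrics mentioned above, here is a minimal sketch of how a confusion matrix and the MCC can be derived from e-values and a threshold. It is not the code in performance.py: the function names, the `(e_value, is_positive)` input format, and the toy values are all assumptions made for illustration, and an entry is assumed to be predicted positive when its e-value is at or below the threshold.

```python
import math


def confusion_matrix(entries, threshold):
    """Count TP, FP, TN, FN for (e_value, is_positive) pairs.

    Assumption: an entry is predicted positive when e_value <= threshold.
    """
    tp = fp = tn = fn = 0
    for e_value, is_positive in entries:
        predicted_positive = e_value <= threshold
        if predicted_positive and is_positive:
            tp += 1
        elif predicted_positive and not is_positive:
            fp += 1
        elif not predicted_positive and is_positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 if the denominator is zero."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Toy data (hypothetical, not taken from entry_eval.class).
toy_entries = [
    (1e-20, True),
    (1e-10, True),
    (0.5, True),
    (1e-6, False),
    (3.0, False),
    (10.0, False),
]

tp, fp, tn, fn = confusion_matrix(toy_entries, threshold=1e-5)
print(tp, fp, tn, fn)
print(round(mcc(tp, fp, tn, fn), 3))
```

Returning 0.0 when the denominator is zero is a common convention for the degenerate case where a whole row or column of the confusion matrix is empty.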