Predicting cleaved peptides in protein sequences.
- Precompute embeddings using src/utils/make_embeddings.py
- Train the model
python3 run.py --embeddings_dir PATH/TO/EMBEDDINGS -df data/labeled_sequences.csv -pf data/graphpart_assignments.csv
Note that the parameters --lr, --batch_size, --dropout, --conv_dropout, --kernel_size, --num_filters, and --hidden_size were optimized in a nested CV hyperparameter search, so the reported models do not use the default values.
- PeptideLocator was evaluated as a licensed executable and cannot be provided in this repo.
- We used 5-fold nested CV to select 20 model checkpoints trained using src/train_loop_crf.py. The selected checkpoints are hardcoded in evaluation/measure_performance.py, which computes the performance metrics from the checkpoints' saved predictions.
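
As a rough illustration of where the number 20 can come from, the sketch below enumerates the fold assignments of a 5-fold nested CV. It assumes that every ordered pair of (test fold, validation fold) yields one checkpoint, with the remaining three folds used for training. The function name `nested_cv_splits` and the dictionary layout are hypothetical and are not taken from src/train_loop_crf.py, whose actual loop may differ.

```python
from itertools import permutations

N_FOLDS = 5  # number of partitions in the 5-fold nested CV


def nested_cv_splits(n_folds=N_FOLDS):
    """Enumerate (test, validation, train) fold assignments.

    Assumption: one model is trained per ordered (test, validation) pair,
    and all remaining folds are used as training data.
    """
    splits = []
    for test_fold, val_fold in permutations(range(n_folds), 2):
        train_folds = [
            f for f in range(n_folds) if f not in (test_fold, val_fold)
        ]
        splits.append(
            {"test": test_fold, "val": val_fold, "train": train_folds}
        )
    return splits


splits = nested_cv_splits()
# 5 outer (test) folds x 4 inner (validation) folds
print(len(splits))
```

Under this assumption, each outer test fold is paired with four inner validation folds, which gives 5 x 4 = 20 checkpoints in total.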