We use Pindel, SVseq2, BreakDancer, and DELLY to generate raw call sets, then merge them into a single candidate set.
Each line describes one deletion: chromosome, start, end+1
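As a rough illustration of the candidate format, the sketch below merges deletion calls from several callers and writes one candidate per line. The merge rule (a plain union of overlapping intervals), the tab separator, and the names `merge_deletions` and `format_candidates` are assumptions made for this example and are not taken from the tool itself.

```python
from typing import Iterable, List, Tuple

# (chromosome, start, end) with an inclusive end coordinate
Deletion = Tuple[str, int, int]


def merge_deletions(calls: Iterable[Deletion]) -> List[Deletion]:
    """Union overlapping deletions on the same chromosome (assumed merge rule)."""
    merged: List[Deletion] = []
    for chrom, start, end in sorted(calls):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps the previous interval: extend it instead of adding a new one.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def format_candidates(deletions: Iterable[Deletion], sep: str = "\t") -> List[str]:
    """Render each deletion as 'chromosome, start, end+1' (separator is assumed)."""
    return [sep.join((chrom, str(start), str(end + 1)))
            for chrom, start, end in deletions]


if __name__ == "__main__":
    raw_calls = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 80)]
    for line in format_candidates(merge_deletions(raw_calls)):
        print(line)
```

If the real candidate file uses a different separator or a different merge criterion (for example, reciprocal overlap), only `sep` and the overlap test need to change.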
- Each BAM file needs a BAI index, and samtools must be in `$PATH`.
- Output filenames are `bam_filename_normalized` and `bam_filename_absolute`. The former is used for training the model.
- Output format: feature1, feature2, ...
- For example: `./Concod -e 0.5 -m 1000 -b test.list`
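The two prerequisites listed above (a BAI index per BAM file and samtools in `$PATH`) can be checked before running the feature extraction. The sketch below is one possible pre-flight check. The helper name `missing_prerequisites` is hypothetical, and accepting either `sample.bam.bai` or `sample.bai` as the index name is an assumption about how the index was created.

```python
import os
import shutil
from typing import List


def missing_prerequisites(bam_paths: List[str]) -> List[str]:
    """Return human-readable problems; an empty list means everything was found."""
    problems: List[str] = []

    # shutil.which searches the directories listed in $PATH.
    if shutil.which("samtools") is None:
        problems.append("samtools not found in $PATH")

    for bam in bam_paths:
        # Assumed index names: sample.bam.bai or sample.bai next to the BAM file.
        index_names = (bam + ".bai", os.path.splitext(bam)[0] + ".bai")
        if not any(os.path.exists(name) for name in index_names):
            problems.append(f"no .bai index for {bam}")

    return problems


if __name__ == "__main__":
    import sys

    for problem in missing_prerequisites(sys.argv[1:]):
        print(problem)
```

Run it with the BAM paths as arguments; it prints nothing when both prerequisites are satisfied.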
We use LIBSVM to train and test the SVM model.
- Add a label to each feature line to create `training_data`, then convert it with `formatDataToLibsvm`.
- Find the optimal parameters and train the model: `python easy.py training_data` and `svm-train`.
- There are two demo models: lowCov_model and hignCov_model.
- Predict: `svm-predict testing_data.scale model predictResult`
- Use the deletions labeled "1" as the final results.
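The final filtering step can be sketched as follows. `svm-predict` writes one predicted label per line, in the same order as the rows of its input file. This example assumes that those rows are in the same order as the candidate deletions. The function name `select_predicted_deletions` is hypothetical.

```python
from typing import Iterable, List


def select_predicted_deletions(candidate_lines: Iterable[str],
                               prediction_lines: Iterable[str]) -> List[str]:
    """Keep the candidates whose predicted label is 1."""
    candidates = [line.rstrip("\n") for line in candidate_lines if line.strip()]
    predictions = [line.strip() for line in prediction_lines if line.strip()]

    # With probability estimates (-b 1), svm-predict starts with a "labels ..." header
    # and each row is "label prob prob"; only the first field is the label.
    if predictions and predictions[0].startswith("labels"):
        predictions = predictions[1:]
    labels = [float(row.split()[0]) for row in predictions]

    if len(candidates) != len(labels):
        raise ValueError("candidate and prediction counts differ")

    return [cand for cand, label in zip(candidates, labels) if label == 1.0]


if __name__ == "__main__":
    cands = ["chr1\t100\t301\n", "chr2\t50\t81\n"]
    preds = ["1\n", "-1\n"]
    print(select_predicted_deletions(cands, preds))
```

Raising an error on a length mismatch is a deliberate choice here: silently truncating with `zip` would pair labels with the wrong deletions if either file were incomplete.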