Accurate peak detection is essential for analyzing high-throughput datasets generated by analytical instruments. Derivative-based methods with noise reduction and matched filtering are frequently used, but they are sensitive to baseline variations, random noise and deviations in peak shape. Continuous wavelet transform (CWT)-based methods are more practical and popular in this situation: they increase accuracy and reliability by identifying peaks across scales in wavelet space while implicitly removing noise and baseline. However, their computational load is relatively high, and the estimated peak features may be inaccurate for overlapping, dense or weak peaks. In this study, we present multi-scale peak detection (MSPD), which takes full advantage of additional information in wavelet space, including ridges, valleys and zero-crossings. It achieves high accuracy by thresholding each detected peak with the maximum of its ridge. Furthermore, MSPD has been designed and implemented efficiently in Python and Cython, and it is significantly faster than MassSpecWavelet. It is particularly suitable for detecting peaks in high-throughput and hyphenated datasets. It has been comprehensively evaluated with both simulated proteomics spectra and the Romanian Database of Raman Spectroscopy. Receiver operating characteristic (ROC) curves show that MSPD can detect more true peaks while keeping the false discovery rate lower than existing methods. Superior results on Raman spectra suggest that MSPD is a more universal method for peak detection.
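To make the idea of thresholding a peak by the maximum of its ridge more concrete, here is a minimal NumPy-only sketch. It is not the MSPD implementation and does not use its API: the function names (`ricker`, `cwt`, `detect_peaks`), the Mexican hat wavelet, the scale range and the threshold are all assumptions chosen for illustration. As a simplification, it takes the largest coefficient over all scales at each position as a stand-in for the ridge maximum, instead of linking ridge lines across scales or using valleys and zero-crossings.

```python
import numpy as np


def ricker(half_width, scale):
    """Mexican hat (Ricker) wavelet sampled on [-half_width, half_width]."""
    t = np.arange(-half_width, half_width + 1, dtype=float)
    norm = 2.0 / (np.sqrt(3.0 * scale) * np.pi ** 0.25)
    return norm * (1.0 - (t / scale) ** 2) * np.exp(-0.5 * (t / scale) ** 2)


def cwt(signal, scales):
    """Return an (n_scales, n_samples) array of wavelet coefficients."""
    signal = np.asarray(signal, dtype=float)
    coefs = np.empty((len(scales), signal.size))
    for i, scale in enumerate(scales):
        half = int(5 * scale)
        # Odd reflection continues a linear trend past both ends,
        # which limits edge artifacts from the baseline.
        padded = np.pad(signal, half, mode="reflect", reflect_type="odd")
        # The kernel is symmetric, so convolution equals correlation here.
        coefs[i] = np.convolve(padded, ricker(half, scale), mode="valid")
    return coefs


def detect_peaks(signal, scales, threshold):
    """Keep local maxima whose largest coefficient over scales passes threshold.

    Simplified stand-in for ridge-based thresholding (hypothetical helper).
    """
    strength = cwt(signal, scales).max(axis=0)
    interior = strength[1:-1]
    is_max = (interior > strength[:-2]) & (interior >= strength[2:])
    candidates = np.flatnonzero(is_max) + 1
    return candidates[strength[candidates] >= threshold]


# Synthetic spectrum: linear baseline, two Gaussian peaks, mild noise.
rng = np.random.RandomState(0)
x = np.arange(500)
signal = (0.01 * x
          + 5.0 * np.exp(-0.5 * ((x - 150) / 5.0) ** 2)
          + 3.0 * np.exp(-0.5 * ((x - 350) / 8.0) ** 2)
          + rng.normal(0.0, 0.05, x.size))
peaks = detect_peaks(signal, scales=np.arange(2, 13), threshold=2.0)
```

Because the wavelet has zero mean and is symmetric, the linear baseline contributes almost nothing to the coefficients, so no separate baseline correction is applied before thresholding. The threshold here is set by hand for the synthetic data; a real spectrum would need it tuned to the noise level.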
Python 2.7 is recommended.
- NumPy
- SciPy
- Matplotlib
- Cython
Zhi-Min Zhang: zmzhang@csu.edu.cn