The aim of this course project is to create a preprocessing pipeline for ECG classification: normal or premature ventricular contraction (PVC).
The raw data is a sequence of samples saved in a CSV file. The figure below shows different parts of that data:
The methods in the preprocessor file are used to slice QRS complexes, align them to the isoline, and smooth out high-frequency components. After the preprocessing step, the data looks as shown below.
Two smoothing approaches were tested: a Butterworth filter and a Savitzky–Golay filter.
The resulting signals look nearly identical for both, so only one filter was kept.
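
The preprocessing steps described above (slicing, isoline alignment, smoothing) can be sketched roughly as follows. This is a minimal illustration and not the project's preprocessor code: the function names, the sampling rate `FS`, the window size, the filter parameters, and the synthetic signal are all assumptions made for the example, and the R-peak positions are assumed to be known in advance.

```python
import numpy as np
from scipy.signal import butter, filtfilt, savgol_filter

FS = 360  # assumed sampling rate in Hz (not stated in the project description)


def slice_beats(signal, r_peaks, half_window):
    """Cut a fixed-length window around each R-peak index (hypothetical helper)."""
    beats = [
        signal[r - half_window : r + half_window]
        for r in r_peaks
        if r - half_window >= 0 and r + half_window <= len(signal)
    ]
    return np.array(beats)


def align_to_isoline(beats):
    """Shift each beat so its median, used here as a rough isoline estimate, is zero."""
    return beats - np.median(beats, axis=1, keepdims=True)


def smooth_butterworth(beats, cutoff_hz=40.0, order=4, fs=FS):
    """Zero-phase low-pass Butterworth filter applied along each beat."""
    b, a = butter(order, cutoff_hz, btype="low", fs=fs)
    return filtfilt(b, a, beats, axis=1)


def smooth_savgol(beats, window_length=15, polyorder=3):
    """Savitzky-Golay smoothing applied along each beat."""
    return savgol_filter(beats, window_length, polyorder, axis=1)


# Synthetic stand-in for the CSV data: offset baseline, noise, one spike per second.
rng = np.random.default_rng(0)
t = np.arange(10 * FS)
r_peaks = np.arange(FS // 2, len(t) - FS // 2, FS)
signal = 0.5 + 0.02 * rng.standard_normal(len(t))
for r in r_peaks:
    signal += np.exp(-0.5 * ((t - r) / 5.0) ** 2)

beats = slice_beats(signal, r_peaks, half_window=FS // 4)
aligned = align_to_isoline(beats)
smoothed_bw = smooth_butterworth(aligned)
smoothed_sg = smooth_savgol(aligned)

# How far apart the two smoothed versions are on this toy signal.
print("max abs difference:", np.max(np.abs(smoothed_bw - smoothed_sg)))
```

Comparing the two outputs numerically, as in the last line, is one way to back up the visual impression that both filters give nearly the same signal.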
The following models were tested:
- Logistic regression
- KNN
- XGBoost
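
A comparison of this kind could be set up along the following lines with scikit-learn. This is only a sketch under assumptions: the feature matrix is synthetic, and the hyperparameters, the F1 metric, and the 5-fold cross-validation are illustrative choices, not the project's actual setup. XGBoost is left out to avoid an extra dependency; `xgboost.XGBClassifier` follows the scikit-learn estimator interface and could be added to the `models` dictionary in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for preprocessed beats (rows) and their labels:
# 0 = normal, 1 = PVC. Not the project's data.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

# Scaling matters for both models here, KNN in particular, since it is distance-based.
models = {
    "Logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Mean F1 score over 5 cross-validation folds for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="f1").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: mean F1 = {score:.3f}")
```

The scores printed for the synthetic data say nothing about the real ECG results; the point is only the shape of the comparison loop.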
The best results were obtained with the KNN model. The diagrams are shown below.
Feature heatmap:
Confusion matrix: