Primary language: Jupyter Notebook. License: MIT.

spectral_classifier

This repo demonstrates a feature engineering task for a data interview.

Data source

Chemical spectra and near-infrared (NIR) spectroscopy data sets from the tobacco industry.

  • Infrared (IR), near-infrared (NIR), and Raman spectroscopy theory => wavenumber and intensity of the characteristic peaks
  • Spectral analysis in the field of chemometrics.
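To illustrate describing characteristic peaks by wavenumber and intensity, here is a minimal sketch using `scipy.signal.find_peaks`. The spectrum is synthetic, and the band positions, widths, and thresholds are assumptions chosen for illustration. They are not values from this repo's data.

```python
import numpy as np
from scipy.signal import find_peaks

# Synthetic spectrum with two Gaussian bands (positions and heights are made up).
wavenumber = np.linspace(4000.0, 10000.0, 601)  # cm^-1, 10 cm^-1 step
spectrum = (
    1.0 * np.exp(-((wavenumber - 5200.0) ** 2) / (2 * 80.0 ** 2))
    + 0.6 * np.exp(-((wavenumber - 7000.0) ** 2) / (2 * 120.0 ** 2))
)

# Keep only local maxima that stand out clearly from their surroundings.
# The height and prominence thresholds are illustrative assumptions.
peak_idx, props = find_peaks(spectrum, height=0.1, prominence=0.05)

peak_positions = wavenumber[peak_idx]   # wavenumber of each peak
peak_heights = props["peak_heights"]    # intensity of each peak
for pos, height in zip(peak_positions, peak_heights):
    print(f"peak at {pos:.0f} cm^-1, intensity {height:.2f}")
```

The resulting (position, intensity) pairs could serve as hand-crafted features. On real, noisy spectra the thresholds would likely need tuning.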

Experiments, typical for classification modeling

  • Peak finding on the spectrum / PCA / KNN. The components of some samples overlap, so PCA is not a good option.
  • SVM (after experiments and spot checks, SVM performs better)
  • CNN for deep feature extraction, with the spectrogram as input (TODO)
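The comparison between a PCA + KNN pipeline and an SVM can be sketched with scikit-learn as follows. This is a hedged illustration on synthetic spectra, not the notebook's actual code. The helper `make_synthetic_spectra`, the band positions, and all hyperparameters are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_spectra(n_per_class=40, n_points=200, seed=0):
    """Build toy spectra: two classes whose absorption bands overlap heavily."""
    rng = np.random.default_rng(seed)
    axis = np.linspace(0.0, 1.0, n_points)
    centers = (0.48, 0.52)  # hypothetical, nearly overlapping band positions
    spectra, labels = [], []
    for label, center in enumerate(centers):
        band = np.exp(-((axis - center) ** 2) / (2 * 0.05 ** 2))
        # Random per-sample intensity scaling plus additive noise.
        scale = rng.uniform(0.8, 1.2, size=(n_per_class, 1))
        noise = rng.normal(0.0, 0.05, size=(n_per_class, n_points))
        spectra.append(scale * band + noise)
        labels.append(np.full(n_per_class, label))
    return np.vstack(spectra), np.concatenate(labels)


X, y = make_synthetic_spectra()

# Two candidate pipelines; hyperparameters are illustrative, not tuned.
candidates = {
    "PCA + KNN": make_pipeline(
        StandardScaler(), PCA(n_components=5), KNeighborsClassifier(n_neighbors=5)
    ),
    "SVM (RBF)": make_pipeline(
        StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale")
    ),
}

scores = {}
for name, model in candidates.items():
    # Stratified 5-fold cross-validation; report the mean accuracy.
    scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean 5-fold accuracy = {scores[name]:.3f}")
```

Cross-validated scores give a fairer comparison than a single train/test split, though which model wins on real data depends on the data set and the pre-processing.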

After training, I found that the model was fooling me, so I need to go back to the pre-processing of the near-infrared (NIR) spectral data. In chemometric modelling, data pre-processing is critical to the model. (Time was limited: I only started this task this Sunday afternoon, so much of the research and code refactoring is still in progress under a tight deadline.)
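This README does not say which pre-processing method is intended. As one commonly used possibility, standard normal variate (SNV) scaling centers and scales each spectrum individually to reduce offset and gain differences between samples. The sketch below is an assumption for illustration, with a hypothetical helper `snv` and made-up data.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


# Toy example: the same band measured with a different offset and gain,
# mimicking scatter effects between samples (values are made up).
axis = np.linspace(0.0, 1.0, 100)
band = np.exp(-((axis - 0.5) ** 2) / (2 * 0.1 ** 2))
raw = np.vstack([band, 1.5 * band + 0.3])

corrected = snv(raw)
print(np.allclose(corrected[0], corrected[1]))  # True
```

After SNV the two rows coincide, because a positive gain and a constant offset cancel out. Other options, such as derivative filters or multiplicative scatter correction, may suit real data better.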

Sample absorbance (AU) versus wavelength.

Reference