This repo demonstrates a feature engineering task for a data interview.
It works with chemical spectra and near-infrared (NIR) spectroscopy data sets from the tobacco industry.
- Infrared (IR), near-infrared (NIR), and Raman spectroscopy theory => wavenumber and intensity of the characteristic peaks
- Spectral analysis in the field of chemometrics.
- Peak finding on the spectrum, PCA, and KNN. The components of some samples overlap, so PCA is not a good option.
- SVM (in experiments and random spot checks, SVM performed better).
- CNN for deep feature extraction, with a spectrogram as input (TODO).
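
The approaches listed above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the code or data used in this repo: the wavenumber grid, band positions, noise level, and all hyperparameters (`prominence`, `n_components`, `n_neighbors`, `C`) are assumptions chosen only to make the example run. It uses `scipy.signal.find_peaks` to locate characteristic peaks and scikit-learn pipelines to compare PCA + KNN against an SVM with cross-validation.

```python
import numpy as np
from scipy.signal import find_peaks
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
wavenumbers = np.linspace(4000, 10000, 300)  # cm^-1, hypothetical NIR grid


def gaussian_band(center, width, height):
    """One absorption band modelled as a Gaussian (an assumption for the demo)."""
    return height * np.exp(-0.5 * ((wavenumbers - center) / width) ** 2)


def make_class(second_band_center, n_samples):
    """Synthetic spectra: two bands, a random intensity scale, and noise."""
    base = gaussian_band(5200, 150, 1.0) + gaussian_band(second_band_center, 200, 0.6)
    scale = rng.uniform(0.8, 1.2, size=(n_samples, 1))
    noise = rng.normal(0.0, 0.02, size=(n_samples, wavenumbers.size))
    return scale * base + noise


# Two classes whose second band is only slightly shifted, so they overlap heavily.
X = np.vstack([make_class(7000, 40), make_class(7100, 40)])
y = np.repeat([0, 1], 40)

# Characteristic peaks of the mean spectrum: wavenumber and intensity.
mean_spectrum = X.mean(axis=0)
peak_idx, _ = find_peaks(mean_spectrum, prominence=0.1)
peak_positions = wavenumbers[peak_idx]
peak_heights = mean_spectrum[peak_idx]
print("peak wavenumbers:", peak_positions)

# Candidate classifiers, compared with 5-fold cross-validation.
pca_knn = make_pipeline(
    StandardScaler(), PCA(n_components=5), KNeighborsClassifier(n_neighbors=5)
)
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))

scores = {
    "PCA + KNN": cross_val_score(pca_knn, X, y, cv=5),
    "SVM": cross_val_score(svm, X, y, cv=5),
}
for name, s in scores.items():
    print(f"{name}: mean accuracy {s.mean():.3f}")
```

Scores on this toy data say nothing about the real tobacco spectra. The point is only the shape of the comparison: the same features go through both pipelines and are judged by the same cross-validation split.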
After training, I found that the model's results were misleading, so I need to go back to the pre-processing of the near-infrared (NIR) spectral data: in chemometric modelling, data pre-processing is critical to model quality. (Time was limited: I only started this task this Sunday afternoon, so much of the research and code refactoring is still in progress under a tight deadline.)
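
As one possible starting point for that pre-processing step, here is a small sketch of two corrections that are commonly applied to NIR spectra: standard normal variate (SNV) scaling and a Savitzky-Golay derivative. Neither is confirmed as the method used in this repo. The function names `snv`, `sg_derivative`, and `preprocess`, the order of the steps, and the filter settings are all assumptions for illustration.

```python
import numpy as np
from scipy.signal import savgol_filter


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std


def sg_derivative(spectra, window_length=11, polyorder=2, deriv=1):
    """Savitzky-Golay smoothed derivative along the wavenumber axis."""
    return savgol_filter(
        spectra, window_length=window_length, polyorder=polyorder, deriv=deriv, axis=1
    )


def preprocess(spectra):
    """SNV followed by a first derivative (order and settings are assumptions)."""
    return sg_derivative(snv(spectra))


# Toy check: the same band measured with a different gain and baseline offset.
x = np.linspace(0.0, 1.0, 200)
band = np.exp(-0.5 * ((x - 0.5) / 0.05) ** 2)
raw = np.vstack([band, 2.5 * band + 0.3])

corrected = snv(raw)
features = preprocess(raw)
print(np.allclose(corrected[0], corrected[1]))
```

SNV removes a constant offset and a positive gain from each spectrum, so the two toy rows coincide after correction even though their raw intensities differ. Real scatter effects are rarely this clean, so the output of any such step should be checked against the actual spectra.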
- Classification Modeling Method for Near-Infrared Spectroscopy of Tobacco Based on Multimodal Convolution Neural Networks
- A Machine Learning Application for Classification of Chemical Spectra
- Deep Learning Models for Wireless Signal Classification with Distributed Low-Cost Spectrum Sensors
- Model-based pre-processing in Raman spectroscopy of biological samples
- Robust preprocessing and model selection for spectral data