Pinned Repositories
1DCNN
Explanation of 1D CNN
1DCNN-1
1DCNN
1DCNN-2
One-dimensional convolutional neural network
1dcnn-3
1DCNN-4
1dcnn-5
1DCNN_Classifier
Classifies reservoir models using well-test pressure derivative curves as training data; the classification model is built with Keras.
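As a rough illustration of such a Keras 1D-CNN classifier over pressure-derivative curves, a minimal sketch follows; the input length, channel count, and number of reservoir-model classes are illustrative assumptions, not taken from the repository.

```python
# Hypothetical 1D-CNN classifier: curves sampled at 256 points, 1 channel,
# distinguishing 4 reservoir-model classes. All shapes are illustrative only.
from tensorflow.keras import layers, models

def build_1dcnn_classifier(n_points=256, n_classes=4):
    model = models.Sequential([
        layers.Input(shape=(n_points, 1)),
        layers.Conv1D(32, kernel_size=5, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(64, kernel_size=5, activation="relu"),
        layers.GlobalAveragePooling1D(),
        layers.Dense(64, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```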
CNN_GRU-Regression
This project uses CNN+GRU in TensorFlow 1.x/Python to implement regression on time series. The main goal is to predict the wind power at the current time from wind speed and wind power data at historical times.
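A minimal sketch of the Conv1D-plus-GRU regression idea is shown below; the window length and feature count are assumptions, and the sketch uses tf.keras rather than the TensorFlow 1.x code of the original project.

```python
# Hypothetical Conv1D + GRU regressor: a window of past (wind speed, wind power)
# pairs in, a single predicted wind-power value out. Shapes are illustrative.
from tensorflow.keras import layers, models

def build_cnn_gru_regressor(window=24, n_features=2):
    model = models.Sequential([
        layers.Input(shape=(window, n_features)),
        layers.Conv1D(32, kernel_size=3, padding="same", activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.GRU(64),      # summarizes the convolved sequence
        layers.Dense(1),     # predicted wind power at the current time
    ])
    model.compile(optimizer="adam", loss="mse")
    return model
```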
Credit-Default-via-Deep-Learning
Using DNN/LSTM/1D-CNN to analyze a credit default problem
SubstancePredication
# Packages installed
1. Anaconda (conda environment with Python 3.6)
2. Keras (conda install -c conda-forge keras)
3. scikit-learn
4. Pandas
5. Matplotlib
6. NumPy
7. NLTK
8. Wordcloud

# Approach
I implemented two approaches, a linear neural network model (Sequential, model 1) and a Convolutional Neural Network (CNN, model 2), both using Keras. I have commented the code wherever needed and explained the different strategies I tried during this exercise.

I prepared three datasets:
1. Just the sentence and label columns
2. Subject + Predicate + Object and label columns
3. Both of the above combined

I ran all three datasets, and #3 performed better than the other two.

There were key differences in data preparation between the two models. For the linear neural network, I tokenized the sentences and vectorized the tokens using word2vec trained on the Google News dataset (a 1x300 vector per word), which captures contextual relevance between words. The word vectors were weighted by term frequency-inverse document frequency (tf-idf), summed, and divided by the word count, so the "mean" (so to speak) of all word vectors in a sentence formed that sentence's vector. Extending this to all rows, the input had shape (number of input rows x word2vec vector dimensionality).

For the CNN, the input was tokenized with the Keras `Tokenizer`, fit on the sentences (rows) iteratively. Instead of passing the mean of the word vectors, I passed the token sequence of each sentence and zero-padded it to the length of the longest sentence in the input set, so the input had shape (number of input rows x length of the longest sentence). The embedding matrix held the word2vec vector of every token in the input corpus, giving it shape (number of unique tokens x word2vec vector dimensionality). In both cases, the train-test split was 80%-20%.

# Results
I plotted the history of accuracy and loss for the model predictions. Both models reached around 60% (+/- 3%) accuracy on training and testing, with no apparent overfitting or underfitting. Test-set metrics were as follows:

### Model 1 (Sequential)
1. Accuracy: 0.5833
2. Precision: 0.5548
3. Recall: 0.8571
4. F-score: 0.6736

### Model 2 (CNN)
1. Accuracy: 0.6233
2. Precision: 0.6056
3. Recall: 0.7143
4. F-score: 0.6555

These results are not terrible, but there is room for improvement through hyperparameter tuning and design tweaks. Increasing the data volume may also yield better metrics, and transfer learning may be a good option for data this small.

# Limitations
The model performance could likely be improved with more data and hyperparameter tuning. One strategy I skipped is k-fold cross-validation; according to one study it can have high variability, and the authors instead suggest a technique called J-K-fold cross-validation to reduce variance during training (https://www.aclweb.org/anthology/C18-1252.pdf). Another strategy I skipped was a grid search for optimized hyperparameter values, as done by Yoon Kim et al. Training deep learning classifiers on a small dataset may not be reliable; transfer learning may be a better option. Other libraries such as fastai (https://docs.fast.ai/), a wrapper over PyTorch, could be an alternative: it provides techniques like the LR Finder, which helps users make informed decisions when choosing learning rates for optimizers (SGD, Adam, or RAdam), transfer learning in which a classifier already trained on a variety of corpora is reused effectively, and advanced recurrent neural network (RNN) strategies. These could be explored in future work.
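As a rough illustration of the CNN-side data preparation described above (tokenize with the Keras `Tokenizer`, zero-pad to the longest sentence, and build a word2vec embedding matrix), here is a hedged sketch; `sentences`, and the gensim `KeyedVectors` handle `w2v`, are assumed inputs not defined in the repository text.

```python
# Sketch of the CNN input pipeline described above. `sentences` (list of str)
# and `w2v` (a 300-d gensim KeyedVectors model) are assumed to exist already.
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

tokenizer = Tokenizer()
tokenizer.fit_on_texts(sentences)
seqs = tokenizer.texts_to_sequences(sentences)

max_len = max(len(s) for s in seqs)                       # longest sentence
X = pad_sequences(seqs, maxlen=max_len, padding="post")   # rows x max_len

# Embedding matrix: (number of unique tokens + 1) x word2vec dimensionality
dim = 300
embedding_matrix = np.zeros((len(tokenizer.word_index) + 1, dim))
for word, idx in tokenizer.word_index.items():
    if word in w2v:
        embedding_matrix[idx] = w2v[word]
```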
goodpupil's Repositories
goodpupil/Credit-Default-via-Deep-Learning
Using DNN/LSTM/1D-CNN to analyze a credit default problem
goodpupil/SubstancePredication
Linear (Sequential) and CNN text classifiers built with Keras for substance predication; see the full write-up under the pinned SubstancePredication entry above.
goodpupil/Alzheimers-DL-Network
A CNN-LSTM deep learning model for prognostic prediction and classification of Alzheimer's MRI neuroimages.
goodpupil/AutomaticWeightedLoss
Multi-task learning using uncertainty to weigh losses for scene geometry and semantics, Auxiliary Tasks in Multi-task Learning
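Below is a hedged PyTorch sketch of uncertainty-based loss weighting in the spirit of the Kendall et al. formulation referenced by this description; it is not necessarily the exact formulation used in the repository.

```python
# Sketch of homoscedastic-uncertainty loss weighting: each task i gets a learned
# log-variance s_i, and the combined loss is sum_i exp(-s_i) * L_i + s_i.
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    def __init__(self, num_tasks=2):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))  # s_i = log sigma_i^2

    def forward(self, *task_losses):
        total = 0.0
        for s, loss in zip(self.log_vars, task_losses):
            total = total + torch.exp(-s) * loss + s
        return total
```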
goodpupil/CNN-binaryClassification-UCI-datset
Builds a CNN model using a UCI dataset and evaluates it with the K-fold technique.
goodpupil/cnn-lstm
CNN-LSTM architecture implemented in PyTorch for video classification
goodpupil/CNN_system
Keywords: CNN, fully connected neural network, SFEW dataset, image preprocessing, data augmentation, Leaky ReLU, k-fold cross-validation, Casper. In this project, I build my own CNN system with image preprocessing and data augmentation, designed around the available computational resources and the characteristics of the dataset used. This project is implemented with PyTorch.
goodpupil/Deep-Learning-with-TensorFlow-book
An open-source introductory deep learning book with hands-on examples, based on the TensorFlow 2.0 framework.
goodpupil/DeepLearningPractice
about deep learning projects
goodpupil/DeepLearningTutorial
Talk is cheap, show me the code! Deep learning, learning deep, have fun!
goodpupil/DrumClassifer-CNN-LSTM
Classifies percussion audio samples with a CNN-LSTM, written in Python and PyTorch. Also exports to Drumkv1 (LV2 plugin).
goodpupil/examples
A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.
goodpupil/HAR_Pytorch
Human activity recognition using a PyTorch CNN & LSTM
goodpupil/image-captioning
Used deep learning to train a CNN + RNN/LSTM on the MS-COCO dataset to automatically generate captions.
goodpupil/imgaug
Image augmentation for machine learning experiments.
goodpupil/PhyCNN
Physics-guided Convolutional Neural Network
goodpupil/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
goodpupil/pytorch-cifar100
Practice on CIFAR-100 (ResNet, DenseNet, VGG, GoogLeNet, InceptionV3, InceptionV4, Inception-ResNetV2, Xception, ResNet in ResNet, ResNeXt, ShuffleNet, ShuffleNetV2, MobileNet, MobileNetV2, SqueezeNet, NASNet, Residual Attention Network, SENet)
goodpupil/PyTorch-Networks
PyTorch implementations of CNN architectures
goodpupil/PyTorch-Tutorial-1
Build your neural network easily and quickly
goodpupil/pytorch_geometric
Geometric Deep Learning Extension Library for PyTorch
goodpupil/pytorch_resnet_cifar10
Proper implementation of ResNet-s for CIFAR10/100 in pytorch that matches description of the original paper.
goodpupil/ResNeXt.pytorch
Reproduces ResNet-V3 (ResNeXt) with PyTorch
goodpupil/skorch
A scikit-learn compatible neural network library that wraps PyTorch
goodpupil/Speech_Signal_Processing_and_Classification
Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step toward any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification, that is, developing two-class classifiers that can discriminate between utterances of a subject suffering from, say, vocal fold paralysis and utterances of a healthy subject. The mathematical modeling of the speech production system in humans suggests that an all-pole system function is justified [1-3]. As a consequence, linear prediction coefficients (LPCs) are a first choice for modeling the magnitude of the short-term spectrum of speech. LPC-derived cepstral coefficients are guaranteed to discriminate between the system (e.g., vocal tract) contribution and that of the excitation. Taking into account the characteristics of the human ear, mel-frequency cepstral coefficients (MFCCs) emerged as descriptive features of the speech spectral envelope. Similarly to MFCCs, perceptual linear prediction coefficients (PLPs) can also be derived. These traditional features will be tested against agnostic features extracted by convolutional neural networks (CNNs) (e.g., auto-encoders) [4]. The pattern recognition step will be based on Gaussian mixture model classifiers, K-nearest neighbor classifiers, Bayes classifiers, as well as deep neural networks. The Massachusetts Eye and Ear Infirmary dataset (MEEI dataset) [5] will be exploited. At the application level, a library for feature extraction and classification in Python will be developed. Credible publicly available resources, such as KALDI, will be used toward achieving our goal. Comparisons will be made against [6-8].
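A minimal sketch, assuming librosa and scikit-learn, of the MFCC-feature-plus-GMM baseline outlined above; `healthy_files` and `pathological_files` are illustrative placeholders for lists of audio paths, not names from the repository.

```python
# Sketch: per-utterance MFCC features classified by two class-conditional GMMs.
# `healthy_files` and `pathological_files` are assumed lists of WAV paths.
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_frames(path, n_mfcc=13):
    y, sr = librosa.load(path, sr=None)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T   # frames x n_mfcc

gmm_healthy = GaussianMixture(n_components=8).fit(
    np.vstack([mfcc_frames(f) for f in healthy_files]))
gmm_pathological = GaussianMixture(n_components=8).fit(
    np.vstack([mfcc_frames(f) for f in pathological_files]))

def classify(path):
    x = mfcc_frames(path)
    # Higher average log-likelihood wins (Bayes decision with equal priors).
    return "healthy" if gmm_healthy.score(x) > gmm_pathological.score(x) else "pathological"
```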
goodpupil/ssqueezepy
Synchrosqueezing, wavelet transforms, and time-frequency analysis in Python
goodpupil/SST
Understanding Synchrosqueezing Transform
goodpupil/Statistical-Learning-Method_Code
From-scratch implementations of all the algorithms in Li Hang's book "Statistical Learning Methods"
goodpupil/Two-stream-CNN-for-rolling-bear-fault-diagnosis
A new bearing fault diagnosis model based on a two-stream CNN. The model combines a 2D-CNN, which takes a wavelet time-frequency map as input, and a 1D-CNN, which takes the original vibration signal as input. After feature extraction by the convolutional and pooling layers, the outputs of the two pooling layers are concatenated and passed through fully connected layers to perform fault classification.
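A hedged PyTorch sketch of the two-stream idea described above; layer sizes, input shapes, and the number of fault classes are assumptions for illustration rather than the repository's actual configuration.

```python
# Sketch of a two-stream fault-diagnosis network: a 2D-CNN over the wavelet
# time-frequency map and a 1D-CNN over the raw vibration signal, with their
# pooled features concatenated and classified by fully connected layers.
import torch
import torch.nn as nn

class TwoStreamCNN(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.branch2d = nn.Sequential(            # input: (B, 1, 64, 64) map
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.branch1d = nn.Sequential(            # input: (B, 1, 1024) signal
            nn.Conv1d(1, 16, 9, padding=4), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, 9, padding=4), nn.ReLU(), nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Sequential(
            nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, n_classes),
        )

    def forward(self, tf_map, signal):
        f2d = self.branch2d(tf_map).flatten(1)    # (B, 32)
        f1d = self.branch1d(signal).flatten(1)    # (B, 32)
        return self.classifier(torch.cat([f2d, f1d], dim=1))
```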
goodpupil/vision
Datasets, Transforms and Models specific to Computer Vision