Pinned Repositories
1DCNN
Explanation of 1D CNN
1DCNN-1
1DCNN
1DCNN-2
One-dimensional convolutional neural network
1dcnn-3
1DCNN-4
1dcnn-5
1DCNN_Classifier
Reservoir model classification using well-test pressure-derivative curves as training data; the classification model is built with Keras.
CNN_GRU-Regression
This project uses CNN+GRU in TensorFlow 1.x/Python to implement regression on time series. The main task is to predict the wind power at the current time from wind speed and wind power data at historical times.
Credit-Default-via-Deep-Learning
Using DNN/LSTM/1D-CNN to analyze a credit default problem
SubstancePredication
# Packages installed
1. Anaconda (conda environment with Python 3.6)
2. Keras (conda install -c conda-forge keras)
3. scikit-learn
4. Pandas
5. Matplotlib
6. NumPy
7. NLTK
8. Wordcloud

# Approach
I implemented two approaches: a linear neural network model (Sequential) (model 1) and a convolutional neural network (CNN) (model 2), both using Keras. I commented the code wherever needed and explained the different strategies I tried during the course of this exercise.

I prepared three datasets:
1. Just the sentence and label columns
2. Subject + Predicate + Object and label columns
3. Both of the above combined

I ran all three datasets, and #3 performed better than the other two.

There were key differences in data preparation between the two models. For the neural network, I tokenized the text and vectorized the tokens using word2vec trained on the Google News dataset (one 1x300 array per word), which captures how contextually related words are. The word vectors were then weighted by term frequency-inverse document frequency (tf-idf), added together, and divided by the word count, so the "mean" (so to speak) of all words in a sentence formed that sentence's vector. Extending this to all rows, the input had shape (number of input rows x vector dimensionality of word2vec).

For the CNN, the input was tokenized with the Keras Tokenizer, fit on the sentences (rows) iteratively. Instead of passing the "mean" of the word vectors, I passed the vectors of each sentence and zero-padded them to the length of the longest sentence in the input set. Hence the input had shape (number of input rows x length of the longest sentence). The embedding matrix was the word2vec mapping of all tokens in the input corpus, so its shape was (number of unique tokens x vector dimensionality of word2vec). A minimal sketch of both input pipelines appears after this README.

In both cases, the train-test data split was 80%-20%.

# Results
I plotted the history of accuracy and loss for the model predictions. Both models yielded around 60% (+/- 3%) accuracy on training and testing, with no apparent overfitting or underfitting. Metrics on the test sets were as follows:

### Model 1 (Sequential)
1. Accuracy: 0.5833
2. Precision: 0.5548
3. Recall: 0.8571
4. F-score: 0.6736

### Model 2 (CNN)
1. Accuracy: 0.6233
2. Precision: 0.6056
3. Recall: 0.7143
4. F-score: 0.6555

These results are not terrible, but there is room for improvement through hyperparameter tuning and design tweaks. Increasing the data volume may also improve the metrics, and transfer learning may be a good option for data this small.

# Limitations
The metrics could be improved with more data and hyperparameter tuning. One strategy I skipped is k-fold cross-validation; according to one study it can show high variability, and the authors instead suggest a technique called J-K-fold cross-validation to reduce variance during training (https://www.aclweb.org/anthology/C18-1252.pdf). Another strategy I skipped was a grid search for optimized hyperparameter values, something done by Yoon Kim et al. Training deep learning classifiers with a small dataset may not be reliable; transfer learning may be a better option. Other libraries such as fastai (https://docs.fast.ai/), a wrapper over PyTorch, could be an alternative.
fastai implements techniques such as the LR Finder, which helps users make informed decisions when choosing learning rates for optimizers (SGD, Adam, or RAdam). It also supports transfer learning, in which a classifier already trained on a variety of corpora is reused, which has proven effective, and it provides advanced recurrent neural network (RNN) strategies. This could be explored in future work.
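As a rough illustration of the data preparation described in the Approach section, below is a minimal sketch of both input pipelines. It assumes gensim's KeyedVectors for the Google News word2vec model, scikit-learn for tf-idf and the train-test split, and tf.keras for tokenization and padding; the file path, toy sentences, and variable names are placeholders, not the repository's actual code.

```python
# Minimal sketch of the two input pipelines (assumed libraries: gensim,
# scikit-learn, tf.keras). Paths, toy data, and names are illustrative only.
import numpy as np
from gensim.models import KeyedVectors
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Toy data standing in for the sentence and label columns.
sentences = [
    "the drug reduces pain",
    "the compound shows no effect",
    "the substance improves recovery",
    "the treatment causes side effects",
    "the compound reduces inflammation",
]
labels = [1, 0, 1, 0, 1]

# Placeholder path to the pretrained 300-d Google News vectors.
w2v = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

# --- Model 1 (Sequential): tf-idf-weighted mean of word2vec vectors ---
tfidf = TfidfVectorizer().fit(sentences)
idf = dict(zip(tfidf.get_feature_names_out(), tfidf.idf_))

def sentence_vector(sentence):
    """Sum idf-weighted word vectors and divide by the word count.
    Repeated tokens contribute multiple times, giving a tf*idf weighting."""
    tokens = [t for t in sentence.lower().split() if t in w2v and t in idf]
    if not tokens:
        return np.zeros(w2v.vector_size)
    weighted = [w2v[t] * idf[t] for t in tokens]
    return np.sum(weighted, axis=0) / len(tokens)

X_dense = np.vstack([sentence_vector(s) for s in sentences])   # (rows, 300)

# --- Model 2 (CNN): padded token ids plus a word2vec embedding matrix ---
tok = Tokenizer()
tok.fit_on_texts(sentences)
seqs = tok.texts_to_sequences(sentences)
max_len = max(len(s) for s in seqs)                 # length of longest sentence
X_cnn = pad_sequences(seqs, maxlen=max_len, padding="post")   # (rows, max_len)

embedding_matrix = np.zeros((len(tok.word_index) + 1, w2v.vector_size))
for word, idx in tok.word_index.items():
    if word in w2v:
        embedding_matrix[idx] = w2v[word]           # (unique tokens, 300)

# 80%/20% train-test split, as used for both models.
X_train, X_test, y_train, y_test = train_test_split(
    X_dense, labels, test_size=0.2, random_state=42)
```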
goodpupil's Repositories
goodpupil/1dcnn-5
goodpupil/1DCNN_NLP_ASS1
Implementation of a one-dimensional convolutional neural network for prediction on housing data
goodpupil/1dCnn_Non_linear_regression
goodpupil/CIFAR-ZOO
PyTorch implementation of CNNs for CIFAR benchmark
goodpupil/cnn-text-classification-tf
Convolutional Neural Network for Text Classification in Tensorflow
goodpupil/CNN_ATTENTION_models
CNN with attention and CNN_LSTM_attention(pytorch)
goodpupil/CNN_TimeSeriesClassification
goodpupil/CNN_TS
Time series with convolutional neural network
goodpupil/CWT_CNN
Classification of multiple non-stationary time series using the continuous wavelet transform
goodpupil/Data_Augmentation-3
Data Augmentation to increase base data size
goodpupil/DeepEEGDataAugmentation
Code for processing EEG data with Riemannian and deep learning-based classifiers. Additionally provides methods for data augmentation including intentionally imbalancing a dataset, and appending modified data to the training set.
goodpupil/End-to-end-Sequence-Labeling-via-Bi-directional-LSTM-CNNs-CRF-Tutorial
Tutorial for End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF
goodpupil/GAN_Time_Series
A model to generate time series data with the purpose of augmenting a dataset of various time series.
goodpupil/General-Purpose-GenP-Bioimage-Ensemble-of-Handcrafted-and-Learned-Features-with-Data-Augmentation
General Purpose (GenP) Bioimage Ensemble of Handcrafted and Learned Features with Data Augmentation
goodpupil/K-fold-CNN-using-Keras
K-fold CNN for forecasting buyer transactions
goodpupil/keras-data-augmentation-res
goodpupil/LSTM-Neural-Network-for-Time-Series-Prediction
LSTM built using the Keras Python package to predict time series steps and sequences. Includes sine wave and stock market data
goodpupil/NLP-1DCNN
assignment 1
goodpupil/Nlp_Assignment1_nonlinear_1dCNN
goodpupil/pos_tagging
Implementation of POS (part-of-speech) tagger with CNN and LSTM using PyTorch
goodpupil/pwc
Papers with code. Sorted by stars. Updated weekly.
goodpupil/PyTorchDocs
Official PyTorch tutorials in Chinese, covering the 60-minute quick-start tutorial, intermediate tutorials, computer vision, natural language processing, generative adversarial networks, and reinforcement learning. Stars and forks welcome!
goodpupil/Time-Series-Forecasting-using-NN-LSTM-and-CNN
Predicted a day's closing stock price from the previous 7 days of time series data using a neural network, LSTM, and CNN. Searched for the number of past days to consider that yields the best model. Also used an LSTM to predict stock prices for companies such as Google and Apple over a continuous 5-day period.
goodpupil/time_series_classification_and_ensembles
Some examples of time series classification using Keras: #1D_CNN #LSTM #Dense #Ensembles
goodpupil/Time_Series_Prediction_TF
Stock prediction using TensorFlow, utilizing LSTM, DNN, and CNN methods.
goodpupil/TimeSeries-CNN
In this project I developed convolutional neural network models for univariate, multivariate, and multi-step time series forecasting.
goodpupil/towerDataAugment
My data augmentation backup
goodpupil/video_analysis
ML Pytorch - Model for video analysis and video classification based on encoder(CNN)-decoder(LSTM) architecture.
goodpupil/Visual-Question-Answering
CNN+LSTM, Attention based, and MUTAN-based models for Visual Question Answering
goodpupil/wide-resnet.pytorch
Best CIFAR-10, CIFAR-100 results with wide-residual networks using PyTorch