Built ANN using Pytorch

TL;DR Built ANN using pytorch to predict if it is going to rain tomorrow or not going to rain tomorrow

Data

Provided in the repo, it is AUS daily wheather data with target label know as RainTomorrow

Some things to note

The network.gz.pdf is visualization of how the network looks
Only 2 hidden layers are being utilized
Output is applied with sigmoid function in order to return 1 or 0

Performance

              precision    recall  f1-score   support

     No rain       0.85      0.96      0.90     19413
     Raining       0.74      0.40      0.52      5525

    accuracy                           0.84     24938
   macro avg       0.80      0.68      0.71     24938
weighted avg       0.83      0.84      0.82     24938

epoch 0
        Train set - loss: 2.513, accuracy: 0.779
        Test  set - loss: 2.517, accuracy: 0.778
        
epoch 100
        Train set - loss: 0.457, accuracy: 0.792
        Test  set - loss: 0.458, accuracy: 0.793
        
epoch 200
        Train set - loss: 0.435, accuracy: 0.801
        Test  set - loss: 0.436, accuracy: 0.8
        
epoch 300
        Train set - loss: 0.421, accuracy: 0.814
        Test  set - loss: 0.421, accuracy: 0.815
        
epoch 400
        Train set - loss: 0.412, accuracy: 0.826
        Test  set - loss: 0.413, accuracy: 0.827
        
epoch 500
        Train set - loss: 0.408, accuracy: 0.831
        Test  set - loss: 0.408, accuracy: 0.832
        
epoch 600
        Train set - loss: 0.406, accuracy: 0.833
        Test  set - loss: 0.406, accuracy: 0.835
        
epoch 700
        Train set - loss: 0.405, accuracy: 0.834
        Test  set - loss: 0.405, accuracy: 0.835
        
epoch 800
        Train set - loss: 0.404, accuracy: 0.834
        Test  set - loss: 0.404, accuracy: 0.835
        
epoch 900
        Train set - loss: 0.404, accuracy: 0.834
        Test  set - loss: 0.404, accuracy: 0.836

Issues

Prediction for rain is not good for recall as the number of data sample for rain is only 20% and not rain has 80% (bias factor)
Current data preprocessing only able to push accuracy up to 84%, even with more epochs

How to run

Clone this repo
Make sure you have Pytorch installed
Run main.py
Optimize and have fun

fadilparves/pytorch-ann

Built ANN using Pytorch

Data

Some things to note

Performance

Issues

How to run