Deep-neural-network-with-limiter-precision-on-GPUs-using-python

Deep neural network with limited precision on GPUs using Python

Introduction

Deep learning (also known as deep structured learning or hierarchical learning) is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms. Learning can be supervised, partially supervised or unsupervised. Training large-scale deep neural networks is often constrained by the available hardware and computational resources. Much previous work addresses how best to exploit general-purpose hardware, typically CPU clusters (1), FPGAs and GPUs (2).

This project studies the effect of limited-precision data representation and computation on neural network training. In low-precision floating-point computation, we expect the rounding scheme to play a crucial role in determining the network's behavior during training. We plan to train a set of state-of-the-art neural networks on the MNIST benchmark dataset using three floating-point formats: 16-bit (half precision), 32-bit (single precision) and 64-bit (double precision). For each format, the networks are trained and then tested at that precision.
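To make the three floating-point formats concrete, here is a minimal sketch that casts the same values to 16, 32 and 64 bits and reports the storage cost and rounding error of each. It uses NumPy only. The helper name `describe_formats`, the random test data and the sample size are assumptions made for illustration, not code from this project.

```python
import numpy as np

# The three floating-point formats discussed in the text.
FORMATS = {"half": np.float16, "single": np.float32, "double": np.float64}


def describe_formats(n_values=1000):
    """Cast the same float64 values to each format and summarize the cost.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(0)
    reference = rng.standard_normal(n_values)  # float64 reference values
    report = {}
    for name, dtype in FORMATS.items():
        cast = reference.astype(dtype)
        # Error introduced by the cast, measured against the float64 values.
        error = np.abs(cast.astype(np.float64) - reference)
        report[name] = {
            "bits": np.finfo(dtype).bits,        # width of the format
            "bytes": cast.nbytes,                # memory used by the array
            "eps": float(np.finfo(dtype).eps),   # spacing of values near 1.0
            "max_abs_error": float(error.max()),
        }
    return report


report = describe_formats()
for name, stats in report.items():
    print(name, stats)
```

Halving the bit width halves the memory footprint, while the rounding error grows as the format narrows. That is the trade-off this project looks at.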

Conclusion

This project set out to check whether lower precision, specifically 16-bit, helps or hurts deep neural networks and their training algorithms. Additionally, we implemented a high-throughput, energy-efficient architecture for matrix multiplication, which failed at int8 precision. Our results show that deep networks can be trained using only a 16-bit floating-point representation, obtained by typecasting the 32-bit and 64-bit float values, with little to no degradation in classification accuracy, power consumption or memory utilization. If I pursue this project in the future, I will try running the model at even lower precision, such as float8, and plan to run and analyze it on more datasets.
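The typecasting idea can be sketched on a single dense layer: compute the same forward pass in float16 and float32 and compare each against a float64 reference. This is only an illustration under stated assumptions. The layer shape (784 inputs as in MNIST, 10 outputs), the random weights and inputs, and the helper name `dense_forward` are hypothetical and are not the networks trained in this project.

```python
import numpy as np


def dense_forward(x, w, b, dtype):
    """One dense layer followed by ReLU, computed in the given dtype.

    Hypothetical helper: inputs are typecast first, then multiplied.
    """
    x, w, b = x.astype(dtype), w.astype(dtype), b.astype(dtype)
    return np.maximum(x @ w + b, 0)


rng = np.random.default_rng(42)
x = rng.standard_normal((32, 784))         # random stand-in for a batch of MNIST-sized inputs
w = rng.standard_normal((784, 10)) * 0.05  # random stand-in for trained weights
b = np.zeros(10)

# Double precision serves as the reference result.
reference = dense_forward(x, w, b, np.float64)

errors = {}
for dtype in (np.float16, np.float32):
    out = dense_forward(x, w, b, dtype)
    # Largest deviation from the float64 result for this precision.
    diff = np.abs(out.astype(np.float64) - reference)
    errors[np.dtype(dtype).name] = float(diff.max())

print(errors)
```

The float16 output deviates more from the reference than the float32 output. Whether a deviation of that size affects classification accuracy has to be measured on the real model and dataset.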