An attempt to replicate the paper described in this PDF:

MLIRX repo used

Their benchmark code

Local CUDA device has no tensor cores

My machine has a GTX 1650, which does not have tensor cores, so I need to try replicating on Google Colab, Kaggle, or another cloud service, e.g. GCP, Azure, or AWS.
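Before running the benchmarks on a cloud instance, it helps to confirm that the assigned GPU actually has tensor cores. Below is a minimal sketch of such a check, not part of the paper or the MLIRX repo. It rests on two assumptions: tensor cores first appeared with compute capability 7.0 (Volta), and GTX 16-series cards report 7.5 but ship without them. The helper names (`likely_has_tensor_cores`, `parse_smi_line`, `query_gpus`) are hypothetical, the exclusion list is not exhaustive (other low-end Turing parts may also lack tensor cores), and the `compute_cap` query field is assumed to be available in the installed `nvidia-smi`.

```python
import subprocess

# Assumption: GTX 16-series cards report compute capability 7.5
# but have no tensor cores. This list is a heuristic, not exhaustive.
NO_TENSOR_CORE_MARKERS = ("GTX 16",)


def likely_has_tensor_cores(name, major, minor):
    """Heuristic: compute capability >= 7.0 and not a known exception."""
    upper = name.upper()
    if any(marker in upper for marker in NO_TENSOR_CORE_MARKERS):
        return False
    return (major, minor) >= (7, 0)


def parse_smi_line(line):
    """Parse one CSV line such as 'Tesla T4, 7.5' into (name, major, minor)."""
    name, cap = line.rsplit(",", 1)
    major, minor = cap.strip().split(".")
    return name.strip(), int(major), int(minor)


def query_gpus():
    """Return [(name, major, minor), ...], or [] if nvidia-smi is unusable."""
    cmd = ["nvidia-smi", "--query-gpu=name,compute_cap", "--format=csv,noheader"]
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
        return [parse_smi_line(l) for l in out.splitlines() if l.strip()]
    except (FileNotFoundError, subprocess.CalledProcessError, ValueError):
        # No driver, an older nvidia-smi without compute_cap, or odd output.
        return []


if __name__ == "__main__":
    for name, major, minor in query_gpus():
        verdict = "likely" if likely_has_tensor_cores(name, major, minor) else "unlikely"
        print(f"{name} (sm_{major}{minor}): tensor cores {verdict}")
```

Running this in a Colab or Kaggle cell before building anything should show quickly whether the session landed on a suitable GPU. The name-based exclusion is there because compute capability alone would wrongly accept the GTX 1650.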