Some tests with the transformers library

Primarily focusing on inference on a user setup with a 16 GB graphics card.

Text generation with Falcon-7B-Instruct

python falcon_test.py --test-gen --bitwidth (8|32)
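
The script itself is not reproduced in this README. As a rough illustration only, the flags in the command above could be parsed along these lines. The function name `build_parser`, the help strings, and the default of 32 are assumptions, not the actual contents of `falcon_test.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the two flags shown in the command above.
    parser = argparse.ArgumentParser(prog="falcon_test.py")
    parser.add_argument("--test-gen", action="store_true",
                        help="run the text generation test")
    # Only the two bit widths listed in the command are accepted.
    # The default of 32 is an assumption made for this sketch.
    parser.add_argument("--bitwidth", type=int, choices=(8, 32), default=32,
                        help="weight precision in bits")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is reproducible.
args = build_parser().parse_args(["--test-gen", "--bitwidth", "8"])
print(f"test_gen={args.test_gen} bitwidth={args.bitwidth}")
```

With `choices=(8, 32)`, argparse rejects any other bit width with a usage error before any model is loaded.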

Text generation appears to be about 4x faster in 8-bit than in 32-bit on an NVIDIA Tesla T4.
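
For context on what the bit width means on a 16 GB card, here is a back-of-the-envelope estimate of the memory taken by the weights alone. It assumes a nominal 7 billion parameters (read off the "7B" in the model name, not an exact count); `weight_memory_gib` and `NOMINAL_PARAMS` are names made up for this sketch.

```python
def weight_memory_gib(num_params: float, bitwidth: int) -> float:
    """Approximate memory for the weights alone, in GiB."""
    bytes_per_param = bitwidth / 8
    return num_params * bytes_per_param / 2**30


# Assumption: "7B" is taken as exactly 7 billion parameters.
NOMINAL_PARAMS = 7e9

for bits in (8, 32):
    print(f"{bits}-bit: ~{weight_memory_gib(NOMINAL_PARAMS, bits):.1f} GiB")
```

Under that assumption the 8-bit weights come to roughly 6.5 GiB and the 32-bit weights to roughly 26 GiB, which is more than a 16 GB card holds. Activations, the KV cache, and framework overhead are not counted.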