This starter template from Banana.dev serves a BERT large language model for on-demand serverless GPU inference.
Primary language: Python
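
For orientation, serverless inference templates of this kind usually load the model once when the container starts and then reuse it for every request. The sketch below illustrates that pattern only. The names `init`, `inference`, `load_model`, and the `prompt` input key are assumptions made for this example and are not confirmed by the description above. The model itself is replaced by a stand-in function so the example runs without a GPU or downloaded weights.

```python
from typing import Any, Callable, Dict, Optional

# Module-level cache so the (expensive) model load happens once per container,
# not once per request. The name `_model` is hypothetical.
_model: Optional[Callable[[str], Dict[str, Any]]] = None


def load_model() -> Callable[[str], Dict[str, Any]]:
    """Stand-in for loading BERT weights onto the GPU (assumption for this sketch).

    A real deployment would return an actual model or pipeline object here.
    """
    def fake_model(prompt: str) -> Dict[str, Any]:
        # Echo the prompt and count whitespace-separated tokens.
        return {"prompt": prompt, "num_tokens": len(prompt.split())}

    return fake_model


def init() -> None:
    """Load the model into the module-level cache if it is not loaded yet."""
    global _model
    if _model is None:
        _model = load_model()


def inference(model_inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Handle one request: validate the input, then run the cached model."""
    prompt = model_inputs.get("prompt")
    if prompt is None:
        return {"message": "No prompt provided"}
    if _model is None:
        init()  # lazy fallback in case init() was not called at startup
    return _model(prompt)
```

A caller would run `init()` once at startup and then pass each request body to `inference`, for example `inference({"prompt": "Hello I am a [MASK] model."})`. Keeping the load step separate from the request handler is what lets a cold start pay the loading cost once while later calls stay fast.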