BERT-LoRA-TensorRT

This repository contains a custom implementation of the BERT model, fine-tuned for specific tasks, along with an implementation of Low-Rank Adaptation (LoRA). The models are optimized for high-performance inference using NVIDIA's TensorRT.
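To illustrate the LoRA idea mentioned above, here is a minimal, dependency-free sketch. It is not taken from this repository's notebooks: the function names (`lora_param_counts`, `matmul`, `lora_effective_weight`) and the `alpha / rank` scaling convention are assumptions made for illustration. The frozen weight `W` is left untouched, and only two small matrices `A` and `B` are trained, so the effective weight becomes `W + (alpha / rank) * B @ A`.

```python
def lora_param_counts(d_in, d_out, rank):
    """Return (full, lora): trainable parameters for a full d_out x d_in
    update versus a rank-`rank` LoRA update (A is rank x d_in, B is d_out x rank)."""
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora


def matmul(a, b):
    """Multiply two matrices given as lists of rows (pure Python, for clarity)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Compute W + (alpha / rank) * B @ A.

    w: d_out x d_in frozen weight
    a: rank x d_in trainable matrix
    b: d_out x rank trainable matrix
    alpha: scaling hyperparameter (assumed convention: scale = alpha / rank)
    """
    rank = len(a)
    scale = alpha / rank
    delta = matmul(b, a)  # d_out x d_in low-rank update
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


if __name__ == "__main__":
    # A 768 x 768 projection (the hidden size of BERT-base) with rank 8.
    full, lora = lora_param_counts(768, 768, 8)
    print(f"full update: {full} parameters, LoRA update: {lora} parameters")
```

For a 768 x 768 projection with rank 8, the LoRA update trains 12,288 parameters instead of 589,824. If `B` is initialized to zeros, the effective weight equals `W` at the start of training, so fine-tuning begins from the pretrained model's behavior.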

Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).
