OftenDream

Pinned Repositories

BERT-EMD
Language: Python
FastBERT
The source code of FastBERT (ACL 2020)
Language: Python
Knowledge_distillation_via_TF2.0
Code for recent knowledge distillation algorithms and benchmark results, using the TF2.0 low-level API
Language: Python
libtorch_tokenizer
BERT Tokenizer in C++
Language: C++
matxscript
A high-performance, extensible Python AOT compiler.
Language: C++
model_compression
Implementation of model compression with knowledge distillation.
Language: Python
models
Models and examples built with TensorFlow
Language: Python
Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Language: Python
R-Drop
Language: Python
rabbit
Deep learning models trained to correct input errors in short, message-like text
Language: Python

OftenDream's Repositories

OftenDream/BERT-EMD
Language: Python
OftenDream/FastBERT
The source code of FastBERT (ACL 2020)
Language: Python
OftenDream/Knowledge_distillation_via_TF2.0
Code for recent knowledge distillation algorithms and benchmark results, using the TF2.0 low-level API
Language: Python
OftenDream/libtorch_tokenizer
BERT Tokenizer in C++
Language: C++
OftenDream/matxscript
A high-performance, extensible Python AOT compiler.
Language: C++
OftenDream/model_compression
Implementation of model compression with knowledge distillation.
Language: Python
OftenDream/models
Models and examples built with TensorFlow
Language: Python
OftenDream/Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Language: Python
OftenDream/R-Drop
Language: Python
OftenDream/rabbit
Deep learning models trained to correct input errors in short, message-like text
Language: Python
OftenDream/SimCSE
EMNLP 2021: SimCSE: Simple Contrastive Learning of Sentence Embeddings
Language: Python
OftenDream/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.