ELS-RD/transformer-deploy
Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
Python · Apache-2.0
Issues

- convert_model command not found (#173, opened by pint1022)
- Unable to use batching (#180, opened by rahulmate)
- Encounter Error: ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB (#183, opened by illumination-k)
- Occasional "CUDA error cudaErrorInvalidConfiguration: invalid configuration argument" error (#166, opened by zoltan-fedor)
- Unable to optimize microsoft/deberta-v3-small model (#150, opened by hlkong323)
- Unable to host .onnx model on Triton server (#179, opened by riyaj8888)
- Unable to do TensorRT inference (#176, opened by riyaj8888)
- Llama support (#170, opened by michaelroyzen)
- How to deploy TrOCR (#163, opened by SShimmyo)
- Support converting T5 model (#122, opened by ayoub-louati)
- transformer-deploy on Triton server 22.08 (#174, opened by lakshaykc)
- Failed to load 'transformer_onnx_model' (#172, opened by rifkybujana)
- Torch 2.0 (#169, opened by varshith15)
- Tokenizer path (#171, opened by vishalsrao)
- Marginal improvement between INT8 and FP16 (#168, opened by alexriggio)
- GPT-J 6B model (#146, opened by timofeev1995)
- Sharing the code for the Triton models (#165, opened by espoirMur)
- Can we support facebook/m2m100_418M model? (#164, opened by chi2liu)
- ViT serving (#162, opened by VoVoR)
- Quick question about the attribute "model_name" in `BertForTokenClassification` (#161, opened by yc-wang00)
- Support for constrained beam search in T5 (#158, opened by junwang-wish)
- Two GPUs are slower than one (#156, opened by OleksandrKorovii)
- TensorRT engine (#155, opened by imsiddhant07)
- Unable to install transformer-deploy module (#149, opened by elvinagam)
- OpusMT conversion (#143, opened by Matthieu-Tinycoaching)
- Embedding with T5-Encoder (#153, opened by DA-L3)
- encoder_hidden_states in the ONNX inputs (#139, opened by pngmafia)
- TypeError: unhashable type: 'slice' (#136, opened by pngmafia)
- [Question] Converting models over 2GB (#138, opened by wkkautas)
- Do the text generation instructions work for the ByT5 models (byte-by-byte T5)? (#140, opened by NOT-HAL9000)
- Failed to load private model (#132, opened by Matthieu-Tinycoaching)
- Unable to pull docker image (#127, opened by brevity2021)
- How to run with Polygraphy graph surgeon (#121, opened by sam-h-bean)
- Question about generative model notebook (#123, opened by hyunwoongko)