ELS-RD/transformer-deploy
Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
Python · Apache-2.0
Issues

- convert_model command not found (#173, opened by pint1022)
- Unable to use batching (#180, opened by rahulmate)
- Encounter Error: ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB (#183, opened by illumination-k)
- Occasional "CUDA error cudaErrorInvalidConfiguration: invalid configuration argument" error (#166, opened by zoltan-fedor)
- Unable to optimize microsoft/deberta-v3-small model (#150, opened by hlkong323)
- Unable to host .onnx model on Triton server (#179, opened by riyaj8888)
- Unable to do TensorRT inference (#176, opened by riyaj8888)
- Llama support (#170, opened by michaelroyzen)
- How to deploy TrOCR (#163, opened by SShimmyo)
- Support converting T5 model (#122, opened by ayoub-louati)
- transformer-deploy on Triton server 22.08 (#174, opened by lakshaykc)
- Failed to load 'transformer_onnx_model' (#172, opened by rifkybujana)
- Torch 2.0 (#169, opened by varshith15)
- Tokenizer path (#171, opened by vishalsrao)
- Marginal improvement between INT8 and FP16 (#168, opened by alexriggio)
- GPT-J 6B model (#146, opened by timofeev1995)
- Sharing the code for the Triton models (#165, opened by espoirMur)
- Can we support facebook/m2m100_418M model? (#164, opened by chi2liu)
- ViT serving (#162, opened by VoVoR)
- Quick question about the attribute "model_name" in `BertForTokenClassification` (#161, opened by yc-wang00)
- Support for constrained beam search in T5 (#158, opened by junwang-wish)
- Two GPUs are slower than one (#156, opened by OleksandrKorovii)
- TensorRT engine (#155, opened by imsiddhant07)
- Unable to install transformer-deploy module (#149, opened by elvinagam)
- OpusMT conversion (#143, opened by Matthieu-Tinycoaching)
- Embedding with T5-Encoder (#153, opened by DA-L3)
- encoder_hidden_states in the ONNX inputs (#139, opened by pngmafia)
- TypeError: unhashable type: 'slice' (#136, opened by pngmafia)
- [Question] Converting models over 2GB (#138, opened by wkkautas)
- Do the text generation instructions work for the ByT5 models (byte-by-byte T5)? (#140, opened by NOT-HAL9000)
- Failed to load private model (#132, opened by Matthieu-Tinycoaching)
- Unable to pull docker image (#127, opened by brevity2021)
- How to run with Polygraphy graph surgeon (#121, opened by sam-h-bean)
- Question about generative model notebook (#123, opened by hyunwoongko)