How to use TensorRT-LLM as a backend
Worromots opened this issue · 4 comments
Worromots commented
As described in the title.
helloyongyang commented
You can set save_fp to True in llmc. Then you can use AMMO in TensorRT-LLM to convert the saved model into a naive quant engine.
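For illustration only, here is a minimal sketch of flipping that flag on an in-memory config. Only the name save_fp comes from this thread; the `save` section, the `save_path` key, and the helper `enable_save_fp` are assumptions made for the example, so check your own llmc config file for the real layout.

```python
import copy


def enable_save_fp(config: dict) -> dict:
    """Return a copy of `config` with save.save_fp set to True.

    The nesting under a "save" section is an assumption for this sketch,
    not a confirmed llmc config layout.
    """
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    updated.setdefault("save", {})["save_fp"] = True
    return updated


# Hypothetical config; "save_path" is a placeholder key for illustration.
config = {"save": {"save_fp": False, "save_path": "./save"}}
new_config = enable_save_fp(config)
print(new_config["save"]["save_fp"])
```

The helper copies the config rather than mutating it, so the original settings stay available if you want to compare runs.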
Worromots commented
Remark: I need your help with this.
Harahan commented
The subsequent steps require modifying some code to change the default settings in TensorRT-LLM. To make the tool more convenient to use, we are working on an official doc page for it. Please be patient and wait for our update.