NVIDIA/TensorRT-LLM

[Doc]: Failed to parse the arguments for the LLM constructor: _TrtLLM got invalid argument: disable_overlap_scheduler

Closed this issue · 5 comments

📚 The doc issue

CUDA_VISIBLE_DEVICES=0,1,2,3 \
trtllm-serve /mnt/model/DeepSeek-R1-Distill-Qwen-32B \
  --tp_size 4 \
  --trust_remote_code \
  --kv_cache_free_gpu_memory_fraction 0.9 \
  --host localhost --port 8001 \
  --extra_llm_api_options ./ctx_extra-llm-api-config.yaml

[2025-09-16 03:59:51] INFO config.py:54: PyTorch version 2.8.0a0+5228986c39.nv25.5 available.
[2025-09-16 03:59:51] INFO config.py:66: Polars version 1.25.2 available.
2025-09-16 03:59:57,726 - INFO - flashinfer.jit: Prebuilt kernels not found, using JIT backend
[TensorRT-LLM] TensorRT-LLM version: 1.0.0rc4
[09/16/2025-03:59:59] [TRT-LLM] [E] Failed to parse the arguments for the LLM constructor: _TrtLLM got invalid argument: disable_overlap_scheduler
Traceback (most recent call last):
File "/usr/local/bin/trtllm-serve", line 8, in
sys.exit(main())
^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1161, in call
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1082, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1697, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1443, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 788, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/commands/serve.py", line 302, in serve
launch_server(host, port, llm_args, metadata_server_cfg, server_role)
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/commands/serve.py", line 145, in launch_server
llm = LLM(**llm_args)
^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/llmapi/llm.py", line 722, in init
super().init(model, tokenizer, tokenizer_mode, skip_tokenizer_init,
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/llmapi/llm.py", line 160, in init
raise e
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/llmapi/llm.py", line 141, in init
raise ValueError(
ValueError: _TrtLLM got invalid argument: disable_overlap_scheduler

Suggest a potential alternative/fix

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.

disable_overlap_scheduler is only available for the PyTorch backend. Can you check if you have backend: trt in your ctx_extra-llm-api-config.yaml?
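
If the YAML currently pins the TensorRT engine path, switching it to the PyTorch backend makes the option valid. A minimal sketch (the backend key follows the maintainer's comment above; the values are illustrative, not taken from the reporter's file):

backend: pytorch
disable_overlap_scheduler: True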

CUDA_VISIBLE_DEVICES=0,1,2,3 trtllm-serve /mnt/model/DeepSeek-R1-Distill-Qwen-32B --tp_size 4 --trust_remote_code --kv_cache_free_gpu_memory_fraction 0.9 --host localhost --port 8001 --extra_llm_api_options ./ctx_extra-llm-api-config.yaml --backend pytorch

adding "--backend pytorch" is OK

ctx_extra-llm-api-config.yaml

# The overlap scheduler for context servers is currently disabled, as it is not yet supported in disaggregated context server architectures.
disable_overlap_scheduler: True
cache_transceiver_config:
  backend: default
  max_tokens_in_buffer: 2048
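
For reference, the same arguments can also be passed programmatically through the LLM API, which is where the error above originates. A minimal sketch, assuming a recent build in which tensorrt_llm.LLM defaults to the PyTorch backend (the model path and tensor parallelism come from the report; everything else is illustrative):

from tensorrt_llm import LLM

# disable_overlap_scheduler is accepted by the PyTorch-backend LLM;
# the TRT-engine class (_TrtLLM in the traceback) rejects it.
llm = LLM(
    model="/mnt/model/DeepSeek-R1-Distill-Qwen-32B",
    tensor_parallel_size=4,
    disable_overlap_scheduler=True,
)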

Issue has not received an update in over 14 days. Adding stale label.

This issue was closed because it has been 14 days without activity since it has been marked as stale.