Support configuring multiple credentials in the same model service for load balancing
Closed this issue · 3 comments
XYZliang commented
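Please support configuring multiple id/key pairs in the credentials of a single model service. A request would then set the model to random-<model name>, for example random-hunyuan-lite, so that it is routed randomly or round-robin across the multiple api_key values configured for hunyuan-lite, which helps with limits such as QPS. This would be a big help for workloads that send many requests in a short time, such as web page translation.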
fruitbars commented
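This is already supported. For example, if the client selects spark-lite, you can configure it as below, and the credentials will be picked at random: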
{
  "api_key": "123456",
  "load_balancing": "random",
  "xinghuo": [
    {
      "models": ["spark-lite"],
      "enabled": true,
      "credentials": {
        "appid": "xxx",
        "api_key": "xxx",
        "api_secret": "xxx"
      }
    },
    {
      "models": ["spark-lite"],
      "enabled": true,
      "credentials": {
        "appid": "xxx",
        "api_key": "xxx",
        "api_secret": "xxx"
      }
    }
  ]
}
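To illustrate what "random" selection over such a config could look like, here is a minimal Python sketch. It is not the project's actual implementation: the function names `candidates` and `pick_credentials`, the fallback to the first entry when `load_balancing` is not "random", and the placeholder values `app-1`, `key-1`, and so on are all assumptions made for the example.

```python
import json
import random

# Same shape as the config above; the credential values are placeholders.
CONFIG = json.loads("""
{
  "api_key": "123456",
  "load_balancing": "random",
  "xinghuo": [
    {"models": ["spark-lite"], "enabled": true,
     "credentials": {"appid": "app-1", "api_key": "key-1", "api_secret": "secret-1"}},
    {"models": ["spark-lite"], "enabled": true,
     "credentials": {"appid": "app-2", "api_key": "key-2", "api_secret": "secret-2"}}
  ]
}
""")


def candidates(config, model):
    """Collect the credentials of every enabled entry that serves `model`."""
    found = []
    for value in config.values():
        if not isinstance(value, list):
            continue  # skip scalar settings such as "api_key"
        for entry in value:
            if (isinstance(entry, dict) and entry.get("enabled")
                    and model in entry.get("models", [])):
                found.append(entry["credentials"])
    return found


def pick_credentials(config, model, rng=random):
    """Pick one credential set for `model` according to "load_balancing"."""
    pool = candidates(config, model)
    if not pool:
        raise LookupError(f"no enabled credentials for model {model!r}")
    if config.get("load_balancing") == "random":
        return rng.choice(pool)
    return pool[0]  # assumption: any other strategy falls back to the first entry


if __name__ == "__main__":
    print(pick_credentials(CONFIG, "spark-lite")["appid"])
```

Each call returns one of the two credential sets, so repeated requests for spark-lite are spread across both keys.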
XYZliang commented
Thanks, then it's pretty much complete already. oneapi is too bloated, haha
On May 29, 2024 at 13:51 +0800, fruitbars ***@***.***> wrote:
… This is already supported. For example, if the client selects spark-lite, you can configure it as shown above, and the credentials will be picked at random.