The `send_message_streaming` function does not raise an error when the maximum context length is exceeded.
Hi!
If a message exceeds the maximum allowed context length for the conversation, the standard `send_message` function raises an error, e.g.:

```
Error: BackendError { message: "This model's maximum context length is 4097 tokens. However, your messages resulted in 5000 tokens. Please reduce the length of the messages.", error_type: "invalid_request_error" }
```

However, when the same message is sent using the `send_message_streaming` function, no error is raised; instead, the conversation becomes permanently stuck.
Is there a way to work around this issue?
Thanks!
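(One possible client-side stopgap while waiting for a proper fix: guard each read from the stream with a timeout, so that a silently dying stream fails instead of hanging forever. A minimal sketch, assuming the stream returned by `send_message_streaming` implements `futures_util::Stream` and that `tokio` is the runtime; `drain_with_deadline` and the deadline value are hypothetical, not part of chatgpt_rs:)

```rust
use std::time::Duration;

use futures_util::{Stream, StreamExt};
use tokio::time::timeout;

/// Hypothetical helper (not part of chatgpt_rs): collect chunks from
/// `stream`, but give up if no chunk arrives within `deadline`.
async fn drain_with_deadline<S>(
    mut stream: S,
    deadline: Duration,
) -> Result<Vec<S::Item>, &'static str>
where
    S: Stream + Unpin,
{
    let mut chunks = Vec::new();
    loop {
        match timeout(deadline, stream.next()).await {
            // A part arrived in time; keep collecting.
            Ok(Some(chunk)) => chunks.push(chunk),
            // The stream ended normally.
            Ok(None) => return Ok(chunks),
            // No part arrived before the deadline: fail instead of
            // waiting forever on a stream the server has abandoned.
            Err(_) => return Err("timed out waiting for the next message part"),
        }
    }
}
```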
Does the underlying API even return any errors when streaming a response? It seems that whenever any error occurs, the server simply stops sending message parts altogether.
It seems that whenever the message is too long, the server answers with status 400, which could be intercepted before calling `bytes_stream()`:
```rust
// Build and send the streaming completion request as before.
let response_stream = self
    .client
    .post(self.config.api_url.clone())
    .json(&CompletionRequest {
        model: self.config.engine.as_ref(),
        stream: true,
        messages: history,
        temperature: self.config.temperature,
        top_p: self.config.top_p,
        frequency_penalty: self.config.frequency_penalty,
        presence_penalty: self.config.presence_penalty,
        reply_count: self.config.reply_count,
    })
    .send()
    .await?;
// Intercept client errors before handing the body to the SSE decoder:
// a 400 here means the request was rejected up front, e.g. because the
// context length was exceeded.
if response_stream.status() == 400 {
    return Err(crate::err::Error::ParsingError(
        "Is your message too large?".to_string(),
    ));
}
let response_stream = response_stream
    .bytes_stream()
    .eventsource();
```
Original code at https://github.com/Maxuss/chatgpt_rs/blob/master/src/client.rs#L183
(Note: in this snippet I used the existing `ParsingError` because it lets me pass a string; however, it might be more appropriate to define a new error specifically for this scenario.)
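A minimal sketch of what such a dedicated error could look like, assuming a `thiserror`-style enum; the `BadRequest` variant, its message format, and the use of `Response::text()` to forward the server's own error body are all assumptions here, not the crate's actual API:

```rust
#[derive(Debug, thiserror::Error)]
pub enum Error {
    /// Existing variant, as used in the snippet above.
    #[error("parsing error: {0}")]
    ParsingError(String),
    /// Hypothetical variant for requests the API rejects up front
    /// (HTTP 400), e.g. when the context length is exceeded.
    #[error("the API rejected the request: {0}")]
    BadRequest(String),
}
```

The interception could then surface the server's own message instead of a hardcoded string:

```rust
if !response_stream.status().is_success() {
    // Response::text() consumes the response, which is fine here
    // because we return early with the server's error body.
    let body = response_stream.text().await.unwrap_or_default();
    return Err(crate::err::Error::BadRequest(body));
}
```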