empirical-run/empirical

Model latency numbers include time taken to retry

saikatmitra91 opened this issue · 2 comments

📝 Description

Because model calls go through the provider SDK, which retries failed requests internally, the reported latency covers the total time until a response arrives, including every failed attempt, rather than the time taken by the final, successful request alone.

📸 Screenshots / Code Snippets

Screenshot 2024-04-21 at 11 41 36 AM

🛠 Proposed Solution

  • Reset the timer on each retry, so that only the final request is measured
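A minimal sketch of the timer-reset idea, in TypeScript. It assumes the retry loop is under the caller's control (for example, with the SDK's internal retries turned off). The names `callWithRetryTiming`, `TimedResult`, `maxAttempts`, `backoffMs`, and `now` are hypothetical and are not taken from this repository or from any SDK.

```typescript
// Hypothetical helper, for illustration only: not the project's actual code.
interface TimedResult<T> {
  value: T;
  latencyMs: number; // duration of the final, successful attempt only
  totalMs: number; // wall-clock time including failed attempts and backoff
  attempts: number; // number of attempts made, including the successful one
}

async function callWithRetryTiming<T>(
  request: () => Promise<T>,
  maxAttempts: number = 3,
  backoffMs: number = 0,
  now: () => number = () => Date.now(), // injectable clock, assumed for testing
): Promise<TimedResult<T>> {
  const overallStart = now();
  let lastError: unknown = new Error("maxAttempts must be at least 1");

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Reset the timer at the start of every attempt.
    const attemptStart = now();
    try {
      const value = await request();
      const end = now();
      return {
        value,
        latencyMs: end - attemptStart,
        totalMs: end - overallStart,
        attempts: attempt,
      };
    } catch (err) {
      lastError = err;
      // Wait before the next attempt; this time is excluded from latencyMs.
      if (attempt < maxAttempts && backoffMs > 0) {
        await new Promise<void>((resolve) => setTimeout(resolve, backoffMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

Reporting `totalMs` alongside `latencyMs` keeps the end-to-end number available while letting the latency metric reflect only the request that actually resolved.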

@saikatmitra91 I would like to work on this. Could you please assign it to me?

Fixed with #216. Closing.