microsoft/onnxruntime-inference-examples

Chat with onnxruntime-web fails when deployed on my machine

salimngit opened this issue · 1 comment

Chat with onnxruntime-web example deployed with

"onnxruntime-web": "1.19.0-dev.20240509-69cfcba38a"
and associated wasm files from https://cdn.jsdelivr.net/npm/onnxruntime-web@1.19.0-dev.20240509-69cfcba38a/dist/

The live demo https://guschmue.github.io/ort-webgpu/chat/index.html works just fine.

When I run it locally, I receive the error below:

Error: found infinitive in logits
    at LLM.argmax (llm.js:152:1)
    at LLM.generate (llm.js:210:1)
    at async Query (main.js:233:1)

(anonymous)             @ main.js:141
Promise.catch (async)
submitRequest           @ main.js:140
(anonymous)             @ main.js:167

@salimngit, it looks like you're encountering an error caused by infinite values in the logits during inference with onnxruntime-web. This can happen for various reasons, including model issues, input data anomalies, or bugs in the runtime.

Here are a few steps you can take to troubleshoot and resolve this issue:

  1. Check Input Data:

    • Ensure that the input data fed to the model is properly preprocessed and normalized. Incorrect input data can lead to invalid computations resulting in infinite values.
  2. Model Integrity:

    • Verify that the model file is not corrupted. Re-download the model if necessary and ensure that it is correctly loaded in your code.
  3. Runtime Version:

    • Make sure you are using the correct version of onnxruntime-web and associated WASM files. Since you mentioned using a specific development version, confirm that the paths to the WASM files are correct and they are accessible.
  4. Check for Known Issues:

    • Look for any known issues or bugs reported with the version you are using. Sometimes, development versions might have unresolved bugs that could be causing the problem.
  5. Error Handling:

    • Add additional error handling in your code to catch and log the specific values causing the issue. This can help pinpoint the exact cause of the infinite values.
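For step 3, onnxruntime-web resolves its .wasm files relative to `ort.env.wasm.wasmPaths`. Pointing that property explicitly at the same CDN build you listed can rule out a local path mismatch (a configuration sketch; adjust the import to however your bundle loads the library):

```javascript
import * as ort from "onnxruntime-web/webgpu";

// Make the runtime fetch its WASM binaries from the exact CDN build
// you are deploying against, instead of a possibly stale local copy.
ort.env.wasm.wasmPaths =
    "https://cdn.jsdelivr.net/npm/onnxruntime-web@1.19.0-dev.20240509-69cfcba38a/dist/";
```

If the demo and your deployment load different .wasm builds, behavior can diverge even with identical model files.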

Here is an example of how you might add error handling to check for infinite values in logits:

async function generateResponse(input) {
    try {
        const logits = await LLM.generate(input);
        
        // Check for infinite values in logits
        if (logits.some(value => !isFinite(value))) {
            throw new Error("Found infinite value in logits");
        }
        
        const response = LLM.argmax(logits);
        return response;
    } catch (error) {
        console.error("Error in generateResponse:", error);
        // Rethrow so callers such as submitRequest can also handle it;
        // swallowing the error here would silently return undefined.
        throw error;
    }
}

async function submitRequest(input) {
    try {
        const response = await generateResponse(input);
        // Process the response
    } catch (error) {
        console.error("Error in submitRequest:", error);
        // Handle the error appropriately
    }
}
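If you want the selection step itself to be robust while you debug, a defensive argmax can skip non-finite entries instead of throwing (a sketch of the general technique, not the actual llm.js implementation):

```javascript
// Defensive argmax over a logits array (plain Array or Float32Array):
// ignores NaN/Infinity entries so a few bad values do not abort generation.
// Returns -1 if no finite value exists at all.
function safeArgmax(logits) {
    let best = -1;
    let bestVal = -Infinity;
    for (let i = 0; i < logits.length; i++) {
        const v = logits[i];
        if (Number.isFinite(v) && v > bestVal) {
            bestVal = v;
            best = i;
        }
    }
    return best;
}

console.log(safeArgmax(new Float32Array([0.1, Infinity, NaN, 2.5, -3.0]))); // → 3
```

Treat a -1 result as a signal to inspect the model output rather than something to paper over in production.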

Verify Local Setup:

  • Ensure all dependencies are correctly installed.
  • Check the console logs for any additional warnings or errors that might give more insight into the problem.

Compare with Working Demo:

  • Compare your local setup with the live demo setup to identify any configuration differences. Ensure all settings, versions, and paths match the working demo.

Debugging Tips:

  • Log intermediate values to understand where the infinite values are introduced.
  • Simplify your input and model to isolate the issue, starting with basic inputs and gradually increasing complexity.
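To log intermediate values, a small helper that summarizes a logits array can show where non-finite values first appear (a hypothetical debug helper, not part of llm.js):

```javascript
// Summarize a logits array (plain Array or Float32Array): finite min/max,
// counts of NaN and Infinity, and the index of the first bad value.
// If every value is non-finite, min/max remain +/-Infinity.
function summarizeLogits(logits) {
    let min = Infinity;
    let max = -Infinity;
    let nanCount = 0;
    let infCount = 0;
    let firstBadIndex = -1;
    for (let i = 0; i < logits.length; i++) {
        const v = logits[i];
        if (Number.isNaN(v)) {
            nanCount++;
            if (firstBadIndex === -1) firstBadIndex = i;
        } else if (!Number.isFinite(v)) {
            infCount++;
            if (firstBadIndex === -1) firstBadIndex = i;
        } else {
            if (v < min) min = v;
            if (v > max) max = v;
        }
    }
    return { min, max, nanCount, infCount, firstBadIndex };
}

console.log(summarizeLogits(new Float32Array([0.5, -1.2, Infinity, NaN, 3.0])));
```

Calling this on the logits each generation step (and logging the token position) narrows down whether the bad values appear immediately or only after a particular input.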

By following these steps, you should be able to identify and resolve the issue with infinite values in the logits during model inference.