The OpenAI model endpoint cannot be changed; how should I add 3rd-party LLMs?
freemank1224 opened this issue · 16 comments
I noticed that I can modify some files in the src folder, but this still doesn't work when I use data_formulator. Can you give users the opportunity to use 3rd-party endpoints?
In order to modify the source code and have the changes reflected in Data Formulator, you should follow the development build (instead of using `pip install data_formulator`).
You can check out the development build instructions here: https://github.com/microsoft/data-formulator/blob/main/DEVELOPMENT.md
You'll need to complete both the backend and frontend builds, and visit http://localhost:3000/ to see the live version.
Some tips: if you make updates to the backend, you'll need to restart the backend server with `.\local_server.sh` to reflect the changes in the custom Data Formulator you are developing.
Where in the app can I change the endpoint, e.g., to use an Ollama model?
I would love to see Ollama as a backend as well, please.
Here is the place to update the endpoint to use other models: https://github.com/microsoft/data-formulator/blob/main/py-src/data_formulator/agents/client_utils.py
There seems to be enough interest. We can update Data Formulator to support Ollama as part of the new version release.
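For anyone who wants to experiment before official support lands, here is a minimal sketch of the kind of change one could make there, assuming the OpenAI Python v1 SDK (the `get_client` helper name is hypothetical, not the actual function in client_utils.py); Ollama exposes an OpenAI-compatible API under /v1:

```python
import os

from openai import OpenAI  # openai>=1.0


def get_client() -> OpenAI:
    """Hypothetical helper: build a client against a configurable endpoint."""
    # With no override this behaves like the stock OpenAI client; setting
    # LLM_BASE_URL to e.g. http://localhost:11434/v1 targets Ollama instead.
    base_url = os.getenv("LLM_BASE_URL")  # None -> default OpenAI endpoint
    # Ollama ignores the key, but the SDK requires a non-empty value.
    api_key = os.getenv("OPENAI_API_KEY", "ollama")
    return OpenAI(base_url=base_url, api_key=api_key)
```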
I would love to see a way to use other LLMs, possibly change the API URL, and also use it locally with Ollama.
Yeah, that sounds great. I also think Ollama would be a good addition, given that open-source models have much better performance nowadays.
Community: for closed-source models, if you all have suggestions on priorities, feel free to share them. We could potentially start with Claude models or Hugging Face keys.
Azure OpenAI would be great (and Microsoft related 😊)
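For reference, Azure OpenAI support is mostly a matter of swapping the client class. A minimal sketch with the openai v1 SDK; the endpoint, deployment name, and API version below are placeholders:

```python
import os

from openai import AzureOpenAI  # openai>=1.0

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",  # use a version your resource supports
)

# Azure routes requests by deployment name rather than raw model name.
response = client.chat.completions.create(
    model="<your-gpt-4o-deployment>",  # placeholder deployment name
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```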
To enable seamless integration with additional LLMs, we propose updating the LLM client initialization in client_utils.py to accept custom endpoints and model parameters using LiteLLM's unified interface. Additionally, we propose introducing a user-friendly UI settings panel where users can input the endpoint URL (e.g., http://localhost:11434 for Ollama), the API key, and the model name. This enhancement would let users switch between OpenAI, local models, and third-party providers.
Example environment variables (Ollama endpoint shown; model and key defaults below):

```bash
export LLM_ENDPOINT="http://localhost:11434"  # Ollama
# defaults:
export LLM_MODEL="llama2"
export LLM_API_KEY=""
```
Example LiteLLM configuration:

```python
import os

from litellm import completion


def get_llm_completion(messages):
    """Send a chat completion via LiteLLM using env-configured endpoint/model."""
    custom_endpoint = os.getenv("LLM_ENDPOINT", "https://api.openai.com/v1")
    api_key = os.getenv("LLM_API_KEY", "")
    model = os.getenv("LLM_MODEL", "gpt-4")  # default model
    # LiteLLM routes Ollama models via the "ollama/" prefix.
    if "ollama" in custom_endpoint.lower():
        model = f"ollama/{model}"
    return completion(
        model=model,
        messages=messages,  # completion() requires the chat messages
        api_base=custom_endpoint,
        api_key=api_key or None,
    )
```
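Continuing from the sketch above, a quick usage check against a local Ollama instance might look like this (assuming `llama2` has already been pulled):

```python
import os

os.environ["LLM_ENDPOINT"] = "http://localhost:11434"
os.environ["LLM_MODEL"] = "llama2"

# get_llm_completion is defined in the block above.
response = get_llm_completion([{"role": "user", "content": "Say hello in one word."}])
print(response.choices[0].message.content)
```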
Yes, this sounds like a very good approach. We'll test it out and do a PR soon.
Okay.
A work in progress supporting non-OpenAI models is here: https://github.com/microsoft/data-formulator/tree/dev
I have tested with Ollama/OpenAI/AzureOpenAI; I'll do a bit more testing before opening a PR.
Check out the client support here: https://github.com/microsoft/data-formulator/blob/dev/py-src/data_formulator/agents/client_utils.py
The frontend update can be tracked at: https://github.com/microsoft/data-formulator/blob/dev/src/views/ModelSelectionDialog.tsx
This is what I planned and tested; it works for all LLMs. Only the UI part for model selection will need to be updated accordingly to match client_utils.py.
Check out this PR: #81; it should work now! Will merge to main soon.
I have updated things throughout (UI, agents, utils) so that Data Formulator can work with custom models.
My experience so far is that models with good code-generation and instruction-following capabilities work best (gpt-4o, gpt-4o-mini, claude-3-5-sonnet, etc.).
Small local models (llama3.2, qwen2.5-coder:3b, codellama:7b) tend to ignore instructions to generate code and can fail frequently on data formulation steps.
If you need to incorporate third-party LLMs without changing OpenAI's fixed endpoint, you can use middleware to route requests based on task requirements, call different LLMs selectively within your app, run local models where supported, or combine responses from multiple LLMs for richer output.
Please check out the new release with custom model support: https://github.com/microsoft/data-formulator/releases/tag/0.1.5
@giyosphere made a good point. It would be good to batch some queries and adaptively choose the model based on task requirements (different agents may have more suitable models). That would be a feature improvement.
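A rough sketch of what per-agent model routing could look like; the agent names and mapping below are hypothetical, not Data Formulator's actual agent registry:

```python
# Hypothetical mapping: code-generation-heavy agents get stronger models,
# lightweight agents can fall back to a cheap or local model.
AGENT_MODEL_MAP = {
    "data_transform": "gpt-4o",       # code generation: needs a strong model
    "concept_derive": "gpt-4o-mini",
    "data_clean": "ollama/llama3.2",  # simple task: a local model may suffice
}


def pick_model(agent_name: str, default: str = "gpt-4o-mini") -> str:
    """Choose a model per agent, falling back to a sensible default."""
    return AGENT_MODEL_MAP.get(agent_name, default)
```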
Hello,
At my company, we use OpenWebUI, which provides a chatbot GUI and exposes LLM APIs in both OpenAI and Ollama formats. Additionally, it allows us to secure these endpoints with an API key.
However, I’ve encountered issues configuring MS DataFormulator with both the Ollama and OpenAI endpoints:
- The Ollama endpoint does not support setting an API key.
- The OpenAI endpoint does not allow changing the base URL.
Would it be possible to either:
- Enable API key support for the Ollama endpoint, or
- Allow customization of the base URL for the OpenAI endpoint?
This would greatly help in integrating MS DataFormulator with OpenWebUI.
Thank you for your consideration!
Best regards,
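Until a configurable base URL lands, one possible workaround is to call OpenWebUI's OpenAI-compatible API directly with the openai SDK. A sketch, assuming the base path below matches your OpenWebUI deployment (the URL and model name are placeholders):

```python
import os

from openai import OpenAI  # openai>=1.0

# OpenWebUI exposes an OpenAI-compatible API secured by an API key; the
# exact base path depends on your deployment ("/api" is an assumption here).
client = OpenAI(
    base_url="http://your-openwebui-host:3000/api",  # placeholder URL
    api_key=os.environ["OPENWEBUI_API_KEY"],
)

response = client.chat.completions.create(
    model="llama3",  # whichever model your OpenWebUI instance serves
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```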