llm-rag-chatbot - 02-Deploy-RAG-Chatbot-Model
Opened this issue · 4 comments
In the llm-rag-chatbot demo, since the pay-per-token foundation models are not available in my region, I had to create my own embedding endpoint. With some minor code changes I was able to save the chunk embeddings in the Delta table.
In the second notebook (02-Deploy-RAG-Chatbot-Model) however, I need to define the DatabricksEmbedding:
embedding_model = DatabricksEmbeddings(endpoint="name of my endpoint")
print(f"Test embeddings: {embedding_model.embed_query('What is Apache Spark?')[:20]}...")
The first line runs fine, but the second generated the following error:
HTTPError: 400 Client Error: Bad Request for url: https://westeurope-c2.azuredatabricks.net/serving-endpoints/Try/invocations. Response text: {"error_code": "BAD_REQUEST", "message": "Invalid input. The input must be a JSON dictionary with exactly one of the input fields {'instances', 'dataframe_records', 'dataframe_split', 'inputs'}.. Received dictionary with input fields: set()."}
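(For context, the error is saying that the request body itself must be a JSON object with exactly one of those top-level keys. A minimal valid body using the `inputs` form would look like the sketch below; the host, token, and endpoint name are placeholders, and this is only a sanity check, not the fix.)

```python
import json

# One valid request shape per the error message: a single top-level
# "inputs" key holding the list of texts to embed.
body = {"inputs": ["What is Apache Spark?"]}

# This could be posted directly to the endpoint (untested sketch):
# import requests, os
# resp = requests.post(
#     "https://<workspace-host>/serving-endpoints/Try/invocations",
#     headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
#     json=body,
# )
print(json.dumps(body))
```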
I wasn't able to find a workaround for this, especially because the embedding_model is later used in the DatabricksVectorSearch retriever.
Note: "Try" is the name of my endpoint
Has anyone experienced the same issue?
The easiest way is to check what input your own embedding model is expecting. Depending on how you created it, you can check its signature on the serving endpoint page or from when you saved the embedding model.
Thanks for the answer!
Yes, I also thought about checking what type of input my endpoint is expecting, and it looks like this:
so theoretically this shouldn't throw any error:
input = str({
    "dataframe_split": {
        "data": [
            ["This is an example sentence", "Each sentence is converted"]
        ]
    }
})
print(f"Test embeddings: {embedding_model.embed_query(input)}...")
but it throws the same error I showed before. I also tried with different variations of the input but without luck. Do you have any suggestions?
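(A note on why the str(...) wrapping can't work: embed_query doesn't send its argument as the raw request body; the client builds its own JSON payload and places the text inside it. Roughly like the simplified stand-in below; this is an illustration of the mechanism, not the actual client source.)

```python
def build_request_body(text: str) -> dict:
    # simplified stand-in for what an embeddings client does internally:
    # it wraps the query text in its own payload structure
    return {"input": [text]}

attempted = str({"dataframe_split": {"data": [["This is an example sentence"]]}})
body = build_request_body(attempted)
# The dataframe_split dict arrives as a quoted string *inside* the payload,
# not as the payload itself, so the endpoint rejects it just the same.
```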
Let's suppose I find the input structure that my embedding endpoint requires; when I initialize the DatabricksVectorSearch retriever like this:
vectorstore = DatabricksVectorSearch(
    vs_index, text_column="content", embedding=embedding_model
)
and later use it, am I supposed to use it like this?
input = <"How do I track my Databricks Billing?" inside the required structure>
vectorstore.get_relevant_documents(input)
Thanks again
Hey,
I think it's not working because you're sending a dataframe_split while calling it through the MLflow API.
Can you try with something like this instead? (I might be wrong on the data, might have added an extra [], give it a try?)
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import DataframeSplitInput

df_split = DataframeSplitInput(
    columns=["text"],
    data=[[{"text": ["test", "test2"]}]]
)
w = WorkspaceClient()
w.serving_endpoints.query(serving_endpoint_name, dataframe_split=df_split)
Both of these blocks:
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import DataframeSplitInput

df_split = DataframeSplitInput(
    columns=["text"],
    data=[[{"text": ["test", "test2"]}]]
)
w = WorkspaceClient()
w.serving_endpoints.query("Try", dataframe_split=df_split)
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import DataframeSplitInput

df_split = DataframeSplitInput(
    columns=["text"],
    data=[{"text": ["test", "test2"]}]
)
w = WorkspaceClient()
w.serving_endpoints.query("Try", dataframe_split=df_split)
are throwing the same error:
DatabricksError: Encountered an unexpected error while evaluating the model. Verify that the input is compatible with the model for inference. Error '0'
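(For reference, this is the flat JSON shape a dataframe_split payload normally takes for a single text column: one scalar string cell per row, rather than nested `{"text": [...]}` dicts. The column name and values here are hypothetical; it's possible the extra nesting is what trips the model.)

```python
# Flat dataframe_split shape: each inner list is one row, each cell a scalar.
payload = {
    "dataframe_split": {
        "columns": ["text"],
        "data": [["test"], ["test2"]],  # two rows, one "text" column each
    }
}
```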
Just to be a bit more clear.
I can query the endpoint and get the embeddings:
client = get_deploy_client("databricks")
response = client.predict(endpoint="Try", inputs={"inputs": ["What is Apache Spark?"]})
response then looks like this:
{'predictions': [[0.01854606904089451,
  -0.014122228138148785,
  -0.05753036588430405,
  0.003444875590503216,
  0.008468957617878914,
  -0.02167639322578907,
  ...
However, later on down the notebook, I need to run:
embedding_model = DatabricksEmbeddings(endpoint="Try")

def get_retriever(persist_dir: str = None):
    os.environ["DATABRICKS_HOST"] = host
    # Get the vector search index
    vsc = VectorSearchClient(workspace_url=host, personal_access_token=os.environ["DATABRICKS_TOKEN"])
    vs_index = vsc.get_index(
        endpoint_name=VECTOR_SEARCH_ENDPOINT_NAME,
        index_name=index_name
    )
    # Create the retriever
    vectorstore = DatabricksVectorSearch(
        vs_index, text_column="content", embedding=embedding_model
    )
    return vectorstore.as_retriever()

vectorstore = get_retriever()
which is failing because of the embedding_model:
HTTPError: 400 Client Error: Bad Request for url: https://westeurope-c2.azuredatabricks.net/serving-endpoints/Try/invocations. Response text: {"error_code": "BAD_REQUEST", "message": "Invalid input. The input must be a JSON dictionary with exactly one of the input fields {'dataframe_records', 'instances', 'dataframe_split', 'inputs'}.. Received dictionary with input fields: set()."}
and I don't know how to modify the block above to make it work.
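One workaround I'm considering (untested sketch): wrap the client.predict call that already works into a small embeddings class and pass that instead of DatabricksEmbeddings, so every request uses the `inputs` payload my endpoint accepts. The class name is mine; if DatabricksVectorSearch insists on a real LangChain Embeddings instance rather than duck typing, the class would additionally need to subclass langchain_core.embeddings.Embeddings.

```python
from typing import List

class CustomEndpointEmbeddings:
    """Minimal stand-in for LangChain's Embeddings interface
    (embed_documents / embed_query) that queries the serving endpoint
    with the {"inputs": [...]} payload that works via client.predict."""

    def __init__(self, endpoint: str):
        self.endpoint = endpoint

    def _predict(self, texts: List[str]) -> List[List[float]]:
        # imported lazily so the class can be defined outside Databricks
        from mlflow.deployments import get_deploy_client
        client = get_deploy_client("databricks")
        response = client.predict(endpoint=self.endpoint, inputs={"inputs": texts})
        return response["predictions"]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return self._predict(texts)

    def embed_query(self, text: str) -> List[float]:
        return self._predict([text])[0]

# then, inside get_retriever():
# embedding_model = CustomEndpointEmbeddings(endpoint="Try")
```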