Issue with OpenAI Encoder
iamyihwa opened this issue · 2 comments
iamyihwa commented
Hello @koaning Thanks for the great package!
I was trying out the OpenAI embedding and ran into an error.
I first fixed the import on line 6 by loading OpenAIEncoder instead of CohereEncoder (from embetter.external import OpenAIEncoder).
However, I still get an error saying that openai is not defined.
I did assign openai.api_key in the code, but not the organization, since the OpenAI page didn't give me one.
Code:
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from embetter.grab import ColumnGrabber
from embetter.external import OpenAIEncoder
import openai
# You must run this first!
#openai.organization = OPENAI_ORG
openai.api_key = 'MY_OWN_KEY'
# Let's suppose this is the input dataframe
dataf = pd.DataFrame({
    "text": ["positive sentiment", "super negative"],
    "label_col": ["pos", "neg"]
})
# This pipeline grabs the `text` column from a dataframe
# which then gets fed into OpenAI's endpoint
text_emb_pipeline = make_pipeline(
    ColumnGrabber("text"),
    OpenAIEncoder()
)
X = text_emb_pipeline.fit_transform(dataf, dataf['label_col'])
# This pipeline can also be trained to make predictions, using
# the embedded features.
text_clf_pipeline = make_pipeline(
    text_emb_pipeline,
    LogisticRegression()
)
# Prediction example
text_clf_pipeline.fit(dataf, dataf['label_col']).predict(dataf)
Error message:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
~\AppData\Local\Temp\1\ipykernel_24268\2450760316.py in <module>
10 OpenAIEncoder()
11 )
---> 12 X = text_emb_pipeline.fit_transform(dataf, dataf['label_col'])
13
14 # This pipeline can also be trained to make predictions, using
~\Miniconda3\envs\SemanticMatching\lib\site-packages\sklearn\pipeline.py in fit_transform(self, X, y, **fit_params)
432 fit_params_last_step = fit_params_steps[self.steps[-1][0]]
433 if hasattr(last_step, "fit_transform"):
--> 434 return last_step.fit_transform(Xt, y, **fit_params_last_step)
435 else:
436 return last_step.fit(Xt, y, **fit_params_last_step).transform(Xt)
~\Miniconda3\envs\SemanticMatching\lib\site-packages\sklearn\base.py in fit_transform(self, X, y, **fit_params)
853 else:
854 # fit method of arity 2 (supervised transformation)
--> 855 return self.fit(X, y, **fit_params).transform(X)
856
857
~\Miniconda3\envs\SemanticMatching\lib\site-packages\embetter\external\_openai.py in transform(self, X, y)
79 result = []
80 for b in _batch(X, self.batch_size):
---> 81 resp = openai.Embedding.create(input=X, model=self.model) # fmt: off
82 result.extend([_["embedding"] for _ in resp["data"]])
83 return np.array(result)
NameError: name 'openai' is not defined
koaning commented
Ah. It seems that openai must be imported in that file in order for it to work? That somewhat surprises me but I'll dive in and have a look.
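For anyone wondering why importing openai in the notebook is not enough: Python resolves a bare name like `openai` in the globals of the module where the function was defined, not in the caller's globals. The sketch below is only an illustration of that rule, not embetter's actual code. It uses the standard-library `json` module as a stand-in for `openai`, and the module names `fake_library` and `fixed_library` are hypothetical.

```python
import types

# Hypothetical stand-in for a library file such as embetter/external/_openai.py.
# It uses `json` (standing in for `openai`) without importing it.
library_source = '''
def encode(texts):
    return json.dumps(texts)
'''

# Build a module object and run the source inside its own namespace.
library = types.ModuleType("fake_library")
exec(library_source, library.__dict__)

# The caller imports the dependency, just like the notebook imported openai.
import json

try:
    library.encode(["positive sentiment"])
    outcome = "ok"
except NameError as err:
    # The lookup happens in the library's globals, where `json` is missing.
    outcome = str(err)

# Adding the import inside the library's own namespace fixes the lookup.
fixed = types.ModuleType("fixed_library")
exec("import json\n" + library_source, fixed.__dict__)
fixed_result = fixed.encode(["positive sentiment"])

print(outcome)
print(fixed_result)
```

If this reading is right, the fix belongs in the library file itself (an `import openai` at the top of `_openai.py`), and nothing the caller does in their own script can supply the missing name.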