BaranziniLab/KG_RAG

run time

DayanaYuan opened this issue · 7 comments

It's been running all night with no results.
def fetch_GPT_response(instruction, system_prompt, chat_model_id, chat_deployment_id, temperature=0):
    print('Calling OpenAI...')
    print("1.4\n")
    response = openai.ChatCompletion.create(
        temperature=temperature,
        deployment_id=chat_deployment_id,
        model=chat_model_id,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": instruction}
        ]
    )
    print("1.5\n")
    # Note: the original condition was missing line continuations and checked
    # len(response) >= 0, which is always true; it should check the choices list.
    if 'choices' in response \
            and isinstance(response['choices'], list) \
            and len(response['choices']) > 0 \
            and 'message' in response['choices'][0] \
            and 'content' in response['choices'][0]['message']:
        return response['choices'][0]['message']['content']
    else:
        return 'Unexpected response'

After `print("1.4\n")`, the call hangs and never returns a result.

Hi @yangyangyang-github
This is a typical OpenAI API call.
If the function is stuck in an endless loop, can you please double-check the OpenAI credentials you provided in the config.yaml file?
Also, we have already put guardrails in the API call function, using the retry module, to stop making calls after a sufficient amount of time. Please refer here. Hence, if you are using the same functionality, it should not end up in an endless loop.
Let me know how things turn out for you.
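For reference, the retry guardrail described above works roughly like the following stdlib-only sketch (KG_RAG's actual code uses a retry package; the decorator name and limits here are illustrative):

```python
import functools
import time

def retry(max_attempts=5, base_delay=1.0):
    """Re-run the wrapped call up to max_attempts times with exponential
    backoff between attempts, then re-raise the last error instead of
    looping forever."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise  # give up after the final attempt
                    time.sleep(base_delay * 2 ** attempt)
        return wrapper
    return decorator
```

With a bounded number of attempts, a call that keeps failing surfaces an exception instead of appearing to hang.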


I have added the openai.api_key parameter to the file. What exactly are the OpenAI credentials that should be provided in the config.yaml file? May I have a look, please?

You should have a file named '.gpt_config.env' stored in your $HOME path. The content of the file should be in the following format:

API_KEY='openai api key'
API_VERSION='this is optional'
RESOURCE_ENDPOINT='this is optional'

The file contains API credentials, which, like any other sensitive information, should ideally not be shared publicly. Hope that helps :)
Feel free to reach out if you need further assistance!

@yangyangyang-github
Did you check whether this is a memory issue? We are not using quantized versions of Llama here, so it can take a good chunk of memory. If you look here, you can see the size of the tensors for llama-13b and compare it with the memory of the machine you are using.
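As a rough sanity check on the memory point above: an unquantized 13B-parameter model at fp16 precision needs about 2 bytes per parameter for the weights alone (a back-of-the-envelope estimate; activations, KV cache, and framework overhead come on top, and exact figures depend on the checkpoint):

```python
def weight_memory_gb(n_params, bytes_per_param=2):
    """Rough lower bound on model weight memory.
    fp16 = 2 bytes/param, fp32 = 4 bytes/param."""
    return n_params * bytes_per_param / 1e9

print(weight_memory_gb(13e9))     # fp16 weights for a 13B model -> 26.0 GB
print(weight_memory_gb(13e9, 4))  # fp32 would be twice that     -> 52.0 GB
```

So the fp16 weights alone already exceed a single 16 GB V100, which is why a multi-GPU instance or plenty of CPU RAM matters here.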

I tried using llama-13b on a p3.8xlarge AWS instance, which has the following specs:
4 Tesla V100 GPU
64 GB GPU memory
32 vCPU
244 GB RAM